METHODS FOR IMPROVING SEEDS
REFERENCE TO RELATED APPLICATIONS

This is a continuation in part of copending application USSN 08/912,272, 
which is a continuation in part of USSN 08/879,827, which is a continuation in part of 
USSN 08/700,152, filed August 20, 1996, all of which are incorporated herein by 
reference. 

FIELD OF THE INVENTION 
The present invention is directed to plant genetic engineering. In 
particular, it relates to new methods for modulating mass and other properties of plant 
seeds. 

BACKGROUND OF THE INVENTION 
The pattern of flower development is controlled by the floral meristem, a
complex tissue whose cells give rise to the different organ systems of the flower.
Genetic and molecular studies have defined an evolutionarily conserved network of genes
that control floral meristem identity and floral organ development in Arabidopsis,
snapdragon, and other plant species (see, e.g., Coen and Carpenter, Plant Cell
5:1175-1181 (1993) and Okamuro et al., Plant Cell 5:1183-1193 (1993)). In
Arabidopsis, a floral homeotic gene APETALA2 (AP2) controls three critical aspects of
flower ontogeny - the establishment of the floral meristem (Irish and Sussex, Plant Cell
2:741-753 (1990); Huala and Sussex, Plant Cell 4:901-913 (1992); Bowman et al.,
Development 119:721-743 (1993); Schultz and Haughn, Development 119:745-765
(1993); Shannon and Meeks-Wagner, Plant Cell 5:639-655 (1993)), the specification of
floral organ identity (Komaki et al., Development 104:195-203 (1988); Bowman et al.,
Plant Cell 1:37-52 (1989); Kunst et al., Plant Cell 1:1195-1208 (1989)), and the
temporal and spatial regulation of floral homeotic gene expression (Bowman et al., Plant
Cell 3:749-758 (1991); Drews et al., Cell 65:991-1002 (1991)).

One early function of AP2 during flower development is to promote the
establishment of the floral meristem. AP2 performs this function in cooperation with at
least three other floral meristem genes, APETALA1 (AP1), LEAFY (LFY), and
CAULIFLOWER (CAL) (Irish and Sussex (1990); Bowman, Flowering Newsletter 14:7-19
(1992); Huala and Sussex (1992); Bowman et al. (1993); Schultz and Haughn (1993);
Shannon and Meeks-Wagner (1993)). A second function of AP2 is to regulate floral
organ development. In Arabidopsis, the floral meristem produces four concentric rings
or whorls of floral organs - sepals, petals, stamens, and carpels. In weak, partial
loss-of-function ap2 mutants, sepals are homeotically transformed into leaves, and petals
are transformed into pollen-producing stamenoid organs (Bowman et al., Development
112:1-20 (1991)). By contrast, in strong ap2 mutants, sepals are transformed into
ovule-bearing carpels, petal development is suppressed, the number of stamens is
reduced, and carpel fusion is often defective (Bowman et al. (1991)). Finally, the
effects of ap2 on floral organ development are in part a result of a third function of AP2,
which is to directly or indirectly regulate the expression of several flower-specific
homeotic regulatory genes (Bowman et al., Plant Cell 3:749-758 (1991); Drews et al.,
Cell 65:991-1002 (1991); Jack et al., Cell 68:683-697 (1992); Mandel et al., Cell
71:133-143 (1992)).

Clearly, AP2 plays a critical role in the regulation of Arabidopsis flower
development. Yet, little is known about how it carries out its functions at the cellular
and molecular levels. A spatial and combinatorial model has been proposed to explain
the role of AP2 and other floral homeotic genes in the specification of floral organ
identity (see, e.g., Coen and Carpenter, supra). One central premise of this model is that
AP2 and a second floral homeotic gene AGAMOUS (AG) are mutually antagonistic genes.
That is, AP2 negatively regulates AG gene expression in sepals and petals, and
conversely, AG negatively regulates AP2 gene expression in stamens and carpels. In situ
hybridization analysis of AG gene expression in wild-type and ap2 mutant flowers has
demonstrated that AP2 is indeed a negative regulator of AG expression. However, it is
not yet known how AP2 controls AG. Nor is it known how AG influences AP2 gene
activity.

The AP2 gene in Arabidopsis has been isolated by T-DNA insertional
mutagenesis as described in Jofuku et al. The Plant Cell 6:1211-1225 (1994). AP2
encodes a putative nuclear factor that bears no significant similarity to any known fungal
or animal regulatory protein. Evidence provided there indicates that AP2 gene activity
and function are not restricted to developing flowers, suggesting that it may play a
broader role in the regulation of Arabidopsis development than originally proposed.

In spite of the recent progress in defining the genetic control of plant 
development, little progress has been reported in the identification and analysis of genes 
affecting agronomically important traits such as seed size, protein content, oil content
and the like. Characterization of such genes would allow for the genetic engineering of 
plants with a variety of desirable traits. The present invention addresses these and other 
needs. 

SUMMARY OF THE INVENTION 
The present invention provides methods of modulating seed mass and other 
traits in plants. The methods involve providing a plant comprising a recombinant 
expression cassette containing an ADC nucleic acid linked to a plant promoter. The plant 
is either selfed or crossed with a second plant to produce a plurality of seeds. Seeds 
with the desired trait (e.g., altered mass) are then selected. 

In some embodiments, transcription of the ADC nucleic acid inhibits 
expression of an endogenous ADC gene or activity of the encoded protein. In these
embodiments, the step of selecting includes the step of selecting seed with increased mass 
or another trait. The seed may have, for instance, increased protein content, 
carbohydrate content, or oil content. In the case of increased oil content, the types of 
fatty acids may or may not be altered as compared to the parental lines. In these 
embodiments, the ADC nucleic acid may be linked to the plant promoter in the sense or 
the antisense orientation. Alternatively, expression of the ADC nucleic acid may enhance 
expression of an endogenous ADC gene or ADC activity and the step of selecting 
includes the step of selecting seed with decreased mass. This embodiment is particularly 
useful for producing seedless varieties of crop plants. 

If the first plant is crossed with a second plant the two plants may be the 
same or different species. The plants may be any higher plants, for example, members 
of the families Brassicaceae or Solanaceae. In making seed of the invention, either the 
female or the male parent plant can comprise the expression cassette containing the ADC 
nucleic acid. In preferred embodiments, both parents contain the expression cassette. 

In the expression cassettes, the plant promoter may be a constitutive 
promoter, for example, the CaMV 35S promoter. Alternatively, the promoter may be a 
tissue-specific promoter. Examples of tissue-specific expression useful in the invention
include fruit-specific and seed-specific (e.g., ovule-specific, embryo-specific,
endosperm-specific, integument-specific, or seed coat-specific) expression.

The invention also provides seed produced by the methods described 
above. The seed of the invention comprise a recombinant expression cassette containing 
an ADC nucleic acid. If the expression cassette is used to inhibit endogenous
ADC expression, the seed will have a mass at least about 20% greater than
the average mass of seeds of the same plant variety which lack the recombinant 
expression cassette. If the expression cassette is used to enhance expression of ADC, the 
seed will have a mass at least about 20% less than the average mass of seeds of the same 
plant variety which lack the recombinant expression cassette. Other traits such as protein 
content, carbohydrate content, and oil content can be altered in the same manner. 
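
By way of illustration only, the mass comparison described above can be sketched in Python as follows. The function name, the 0.20 default threshold (standing in for "at least about 20%"), and the example masses are assumptions made for this sketch; they are not data from the disclosure.

```python
from statistics import mean


def classify_seed_mass(seed_mass_mg, control_masses_mg, threshold=0.20):
    """Compare one seed's mass with the average mass of control seeds.

    control_masses_mg: masses of seeds of the same variety that lack the
    recombinant expression cassette (hypothetical values in this sketch).
    Returns (relative_change, label).
    """
    control_avg = mean(control_masses_mg)
    relative_change = (seed_mass_mg - control_avg) / control_avg
    if relative_change >= threshold:
        label = "increased"      # e.g., expected when ADC expression is inhibited
    elif relative_change <= -threshold:
        label = "decreased"      # e.g., expected when ADC expression is enhanced
    else:
        label = "unchanged"
    return relative_change, label


# Illustrative values only.
controls = [2.0, 2.0, 2.0]
print(classify_seed_mass(2.5, controls))
print(classify_seed_mass(1.5, controls))
```

The same comparison could be applied to protein, carbohydrate, or oil content by substituting the measured quantity for mass.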



Definitions 

The phrase "nucleic acid sequence" refers to a single or double-stranded 
polymer of deoxyribonucleotide or ribonucleotide bases read from the 5' to the 3' end. 
It includes chromosomal DNA, self-replicating plasmids, infectious polymers of DNA or 
RNA and DNA or RNA that performs a primarily structural role.

The term "promoter" refers to a region or sequence determinants located 
upstream or downstream from the start of transcription and which are involved in 
recognition and binding of RNA polymerase and other proteins to initiate transcription. 
A "plant promoter" is a promoter capable of initiating transcription in plant cells. 

The term "plant" includes whole plants, plant organs (e.g., leaves, stems, 
flowers, roots, etc.), seeds and plant cells and progeny of same. The class of plants 
which can be used in the method of the invention is generally as broad as the class of 
higher plants amenable to transformation techniques, including angiosperms
(monocotyledonous and dicotyledonous plants), as well as gymnosperms. It includes 
plants of a variety of ploidy levels, including polyploid, diploid, haploid and 
hemizygous. 

A polynucleotide sequence is "heterologous to" an organism or a second 
polynucleotide sequence if it originates from a foreign species, or, if from the same 
species, is modified from its original form. For example, a promoter operably linked to 
a heterologous coding sequence refers to a coding sequence from a species different from 
that from which the promoter was derived, or, if from the same species, a coding
sequence which is different from any naturally occurring allelic variants. As defined 
here, a modified ADC coding sequence which is heterologous to an operably linked ADC 
promoter does not include the T-DNA insertional mutants (e.g., ap2-10) as described in 
Jofuku et al. The Plant Cell 6:1211-1225 (1994). 

A polynucleotide "exogenous to" an individual plant is a polynucleotide 
which is introduced into the plant by any means other than by a sexual cross. Examples 
of means by which this can be accomplished are described below, and include 
Agrobacterium-mediated transformation, biolistic methods, electroporation, and the like.
Such a plant containing the exogenous nucleic acid is referred to here as an R1
generation transgenic plant. Transgenic plants which arise from sexual cross or by 
selfing are descendants of such a plant. 

An "ADC (AP2 domain containing) nucleic acid" or "ADC polynucleotide
sequence" of the invention is a subsequence or full length polynucleotide sequence of a
gene which encodes a polypeptide containing an AP2 domain and which, when present in a
transgenic plant, can be used to modulate seed properties in seed produced by the plant.
An exemplary nucleic acid of the invention is the Arabidopsis AP2 sequence as disclosed
in Jofuku et al. The Plant Cell 6:1211-1225 (1994). The GenBank accession number for
this sequence is U12546. As explained in detail below, a family of RAP2 (related to AP2)
genes has been identified in Arabidopsis. The class of nucleic acids claimed here falls
into at least two subclasses (AP2-like and EREBP-like genes), which are distinguished
by, for instance, the number of AP2 domains contained within each polypeptide and by
sequences within certain conserved regions. The differences between these two
subclasses are described in more detail below. ADC polynucleotides are defined by their
ability to hybridize under defined conditions to the exemplified nucleic acids or PCR
products derived from them. An ADC polynucleotide (e.g., AP2 or RAP2) is typically at
least about 30-40 nucleotides to about 3000, usually less than about 5000 nucleotides in
length. Usually the nucleic acids are from about 100 to about 2000 nucleotides, often
from about 500 to about 1700 nucleotides in length.

ADC nucleic acids, as explained in more detail below, are a new class of 
plant regulatory genes that encode ADC polypeptides, which are distinguished by the 
presence of one or more of a 56-68 amino acid repeated motif, referred to here as the 
"AP2 domain". The amino acid sequence of an exemplary AP2 polypeptide is shown in 
Jofuku et al., supra. One of skill will recognize that in light of the present disclosure
various modifications (e.g., substitutions, additions, and deletions) can be made to the
sequences shown there without substantially affecting their function. These variations are
specifically covered by the terms ADC polypeptide or ADC polynucleotide.

In the case of both expression of transgenes and inhibition of endogenous 
genes (e.g., by antisense, or sense suppression) one of skill will recognize that the 
inserted polynucleotide sequence need not be identical, but may be only "substantially 
identical" to a sequence of the gene from which it was derived. As explained below, 
these substantially identical variants are specifically covered by the term ADC nucleic 
acid. 

In the case where the inserted polynucleotide sequence is transcribed and 
translated to produce a functional polypeptide, one of skill will recognize that because of 
codon degeneracy a number of polynucleotide sequences will encode the same 
polypeptide. These variants are specifically covered by the terms "ADC nucleic acid", 
"AP2 nucleic acid" and "RAP2 nucleic acid". In addition, the term specifically includes 
those full length sequences substantially identical (determined as described below) with 
an ADC polynucleotide sequence and that encode proteins that retain the function of the 
ADC polypeptide (e.g., resulting from conservative substitutions of amino acids in the 
AP2 polypeptide). In addition, variants can be those that encode dominant negative
mutants as described below.
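
The effect of codon degeneracy noted above can be illustrated with a short Python sketch. The two nucleotide sequences below are hypothetical examples constructed for this illustration (they are not sequences from the disclosure), and the helper name `translate` is likewise an assumption. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence in frame 1, stopping at a stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Two hypothetical sequences that differ at four nucleotide positions.
seq_a = "ATGTACCGTGGT"
seq_b = "ATGTATAGAGGC"

differences = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
print(translate(seq_a), translate(seq_b), differences)
```

Both sequences translate to the same four-residue peptide although they are not identical at the nucleotide level, which is the situation the paragraph above describes.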

Two nucleic acid sequences or polypeptides are said to be "identical" if the 
sequence of nucleotides or amino acid residues, respectively, in the two sequences is the 
same when aligned for maximum correspondence as described below. The term 
"complementary to" is used herein to mean that the complementary sequence is identical 
to all or a portion of a reference polynucleotide sequence. 

Sequence comparisons between two (or more) polynucleotides or
polypeptides are typically performed by comparing the two sequences over
a "comparison window" to identify and compare local regions of sequence similarity. A
"comparison window", as used herein, refers to a segment of at least about 20
contiguous positions, usually about 50 to about 200, more usually about 100 to about 150,
in which a sequence may be compared to a reference sequence of the same number of
contiguous positions after the two sequences are optimally aligned.



WO 99/41974 PCT/US99/03429 

Optimal alignment of sequences for comparison may be conducted by the
local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the
homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970),
by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci.
(U.S.A.) 85:2444 (1988), by computerized implementations of these algorithms (GAP,
BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group (GCG), 575 Science Dr., Madison, WI), or by inspection.
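
For readers unfamiliar with local alignment, the following is a minimal Python sketch of the Smith-Waterman approach. It is not the GCG implementation: it assumes a simple linear gap penalty, and the scoring values (match +2, mismatch -1, gap -2), the function name, and the example sequences are all illustrative assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cell values never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical sequences sharing the local region ACGT.
print(smith_waterman("TTTACGTGG", "CCACGTAA"))
```

Production tools use substitution matrices and affine gap penalties; the sketch is intended only to show how a locally optimal region is found.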

"Percentage of sequence identity" is determined by comparing two 
optimally aligned sequences over a comparison window, wherein the portion of the 
polynucleotide sequence in the comparison window may comprise additions or deletions 
(i.e., gaps) as compared to the reference sequence (which does not comprise additions or 
deletions) for optimal alignment of the two sequences. The percentage is calculated by 
determining the number of positions at which the identical nucleic acid base or amino 
acid residue occurs in both sequences to yield the number of matched positions, dividing 
the number of matched positions by the total number of positions in the window of 
comparison and multiplying the result by 100 to yield the percentage of sequence 
identity. 
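
The calculation just described can be sketched in Python as follows. This is an illustrative sketch only: it assumes the two sequences have already been optimally aligned over the comparison window, that gaps are written as "-", and that the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_reference):
    """Percentage of sequence identity over an aligned comparison window.

    Matched positions are those where both sequences carry the same base or
    residue; a gap never counts as a match. The denominator is the total
    number of positions in the window.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("comparison window is empty")
    matches = sum(1 for q, r in zip(aligned_query, aligned_reference)
                  if q == r and q != "-")
    return 100.0 * matches / len(aligned_query)


# Hypothetical 10-position window: one gap and one mismatch.
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))
```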

The term "substantial identity" of polynucleotide sequences means that a 
polynucleotide comprises a sequence that has at least 60% sequence identity, preferably 
at least 80%, more preferably at least 90% and most preferably at least 95%, compared 
to a reference sequence using the programs described above (preferably BLAST) using 
standard parameters. One of skill will recognize that these values can be appropriately 
adjusted to determine corresponding identity of proteins encoded by two nucleotide 
sequences by taking into account codon degeneracy, amino acid similarity, reading frame 
positioning and the like. Substantial identity of amino acid sequences for these purposes 
normally means sequence identity of at least 35%, preferably at least 60%, more 
preferably at least 90%, and most preferably at least 95%. Polypeptides which are 
"substantially similar" share sequences as noted above except that residue positions which 
are not identical may differ by conservative amino acid changes. Conservative amino 
acid substitutions refer to the interchangeability of residues having similar side chains. 
For example, a group of amino acids having aliphatic side chains is glycine, alanine, 
valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side 
chains is serine and threonine; a group of amino acids having amide-containing side 
chains is asparagine and glutamine; a group of amino acids having aromatic side chains
is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side
chains is lysine, arginine, and histidine; and a group of amino acids having
sulfur-containing side chains is cysteine and methionine. Preferred conservative amino acid
substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine,
alanine-valine, and asparagine-glutamine.
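
As an illustration, the preferred conservative substitution groups listed above can be encoded and applied to a pair of aligned polypeptides as in the following Python sketch. The one-letter codes follow standard usage; the function names and the example peptides are hypothetical.

```python
# Preferred conservative substitution groups from the text, in one-letter code:
# Val-Leu-Ile, Phe-Tyr, Lys-Arg, Ala-Val, Asn-Gln.
PREFERRED_GROUPS = [frozenset("VLI"), frozenset("FY"), frozenset("KR"),
                    frozenset("AV"), frozenset("NQ")]


def is_conservative(res_a, res_b):
    """True if two different residues fall within one preferred group."""
    a, b = res_a.upper(), res_b.upper()
    return a != b and any(a in group and b in group for group in PREFERRED_GROUPS)


def classify_positions(seq_a, seq_b):
    """Count identical, conservative, and other positions in two aligned peptides."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    counts = {"identical": 0, "conservative": 0, "other": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b:
            counts["identical"] += 1
        elif is_conservative(a, b):
            counts["conservative"] += 1
        else:
            counts["other"] += 1
    return counts


# Hypothetical aligned peptides.
print(classify_positions("MVKFA", "MLRYG"))
```

Note that alanine and leucine are not treated as interchangeable here, because the text lists alanine-valine and valine-leucine-isoleucine as separate groups.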

Another indication that nucleotide sequences are substantially identical is if 
two molecules hybridize to each other, or a third nucleic acid, under stringent conditions. 
Stringent conditions are sequence dependent and will be different in different 
circumstances. Generally, stringent conditions are selected to be about 5°C lower than
the thermal melting point (Tm) for the specific sequence at a defined ionic strength and 
pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of 
the target sequence hybridizes to a perfectly matched probe. Typically, stringent 
conditions will be those in which the salt concentration is about 0.02 molar at pH 7 and 
the temperature is at least about 60°C. 

In the present invention, genomic DNA or cDNA comprising ADC nucleic 
acids of the invention can be identified in standard Southern blots under stringent 
conditions using the nucleic acid sequences disclosed here. For the purposes of this 
disclosure, stringent conditions for such hybridizations are those which include at least 
one wash in 0.2X SSC at a temperature of at least about 50°C, usually about 55°C to 
about 60°C, for 20 minutes, or equivalent conditions. Other means by which nucleic 
acids of the invention can be identified are described in more detail below. 



BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1A shows amino acid sequence alignment between AP2 direct
repeats AP2-R1 (aa 129-195) and AP2-R2 (aa 221-288). Solid and dashed lines between
the two sequences indicate residue identity and similarity, respectively. Arrows indicate
the positions of the ap2-1, ap2-5, and ap2-10 mutations described in Jofuku et al. (1994).
The bracket above the AP2-R1 and AP2-R2 sequences indicates the residues capable of
forming amphipathic α-helices shown in Figure 1B.

Figure 1B is a schematic diagram of the putative AP2-R1 (R1) and AP2-R2
(R2) amphipathic α-helices. The NH2-terminal ends of the R1 and R2 helices begin
at residues Phe-160 and Phe-253 and rotate clockwise by 100° per residue through
Phe-177 and Cys-270, respectively. Arrows directed toward or away from the center of
the helical wheel diagrams indicate the negative or positive degree of hydrophobicity as
defined by Jones et al. J. Lipid Res. 33:287-296 (1992).
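
The geometry of such a helical wheel can be sketched in Python as follows. The sketch only computes the angular position of each residue from the 100° per residue rotation stated above; the function name is hypothetical, and hydrophobicity values are not computed.

```python
def helical_wheel_angles(first_residue, last_residue, step_degrees=100):
    """Map residue numbers to their angle (degrees, clockwise) on a helical wheel.

    The first residue is placed at 0 degrees and each subsequent residue is
    rotated by step_degrees, wrapping around at 360.
    """
    return {res: ((res - first_residue) * step_degrees) % 360
            for res in range(first_residue, last_residue + 1)}


# Residue ranges taken from the Figure 1B description.
r1 = helical_wheel_angles(160, 177)   # Phe-160 through Phe-177
r2 = helical_wheel_angles(253, 270)   # Phe-253 through Cys-270
print(len(r1), r1[160], r1[164], r1[177])
```

With a 100° step, 18 residues occupy 18 distinct positions spaced 20° apart, which is why an 18-residue segment fills the wheel exactly once.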

Figure 2 shows an antisense construct of the invention. pPW14.4 (which
is identical to pPW15) represents the 13.41 kb AP2 antisense gene construct used in
plant transformation described here. pPW14.4 is comprised of the AP2 gene coding
region in a transcriptional fusion with the cauliflower mosaic virus 35S (P35S)
constitutive promoter in an antisense orientation. The Ti plasmid vector used is a
modified version of the pGSJ780A vector (Plant Genetic Systems, Gent, Belgium) in
which a unique EcoRI restriction site was introduced into the BamHI site using a
ClaI-EcoRI-BamHI adaptor. The modified pGSJ780A vector DNA was linearized with
EcoRI and the AP2 coding region inserted as a 1.68 kb EcoRI DNA fragment from AP2
cDNA plasmid cAP2#1 (Jofuku et al., 1994) in an antisense orientation with respect to
the 35S promoter. KmR represents the plant selectable marker gene NPTII which
confers resistance to the antibiotic kanamycin to transformed plant cells carrying an
integrated 35S-AP2 antisense gene. Boxes 1 and 5 represent the T-DNA left and right
border sequences, respectively, that are required for transfer of T-DNA containing the
35S-AP2 antisense gene construct into the plant genome. Regions 2 and 3 contain
T-DNA sequences. Box 3 designates the 3' octopine synthase gene sequences that function
in transcriptional termination. Region 6 designates bacterial DNA sequences that
function as a bacterial origin of replication in both E. coli and Agrobacterium
tumefaciens, thus allowing pPW14.4 plasmid replication and retention in both bacteria.
Box 7 represents the bacterial selectable marker gene that confers resistance to the
antibiotics streptomycin and spectinomycin and allows for selection of Agrobacterium
strains that carry the pPW14.4 recombinant plasmid.

Figure 3 shows a sense construct of the invention. pPW12.4 (which is
identical to pPW9) represents the 13.41 kb AP2 sense gene construct used in plant
transformation described here. pPW12.4 is comprised of the AP2 gene coding region in
a transcriptional fusion with the cauliflower mosaic virus 35S (P35S) constitutive
promoter in a sense orientation. The Ti plasmid vector used is a modified version of the
pGSJ780A vector (Plant Genetic Systems, Gent, Belgium) in which a unique EcoRI
restriction site was introduced into the BamHI site using a ClaI-EcoRI-BamHI adaptor.
The modified pGSJ780A vector DNA was linearized with EcoRI and the AP2 coding
region inserted as a 1.68 kb EcoRI DNA fragment from AP2 cDNA plasmid cAP2#1
(Jofuku et al., 1994) in a sense orientation with respect to the 35S promoter. KmR
represents the plant selectable marker gene NPTII which confers resistance to the
antibiotic kanamycin to transformed plant cells carrying an integrated 35S-AP2 sense
gene. Boxes 1 and 5 represent the T-DNA left and right border sequences, respectively,
that are required for transfer of T-DNA containing the 35S-AP2 sense gene construct into
the plant genome. Regions 2 and 3 contain T-DNA sequences. Box 3 designates the 3'
octopine synthase gene sequences that function in transcriptional termination. Region 6
designates bacterial DNA sequences that function as a bacterial origin of replication in
both E. coli and Agrobacterium tumefaciens, thus allowing pPW12.4 plasmid replication
and retention in both bacteria. Box 7 represents the bacterial selectable marker gene that
confers resistance to the antibiotics streptomycin and spectinomycin and allows for
selection of Agrobacterium strains that carry the pPW12.4 recombinant plasmid.

Figures 4A and 4B show AP2 domain sequence and structure. The
number of amino acid residues within each AP2 domain is shown to the right. Sequence
gaps were introduced to maximize sequence alignments. The positions of amino acid
residues and sequence gaps within the AP2 domain alignments are numbered 1-77 for
reference. The locations of the conserved YRG and RAYD elements are indicated by
brackets. Shaded boxes highlight regions of sequence similarity. Positively charged
amino acids within the YRG element are indicated by + signs above the residues. The
location of the 18-amino acid core region that is predicted to form an amphipathic
α-helix in AP2 is indicated by a bracket. Residues within the RAYD element of each AP2
domain that are predicted to form an amphipathic α-helix are underlined. Figure 4A
shows members of the AP2-like subclass. Amino acid sequence alignment between the
AP2 domain repeats R1 and R2 contained within AP2, ANT and RAP2.7 is shown.
Brackets above the sequences designate the conserved YRG and RAYD blocks described
above. The filled circle and asterisk indicate the positions of the ap2-1 and ap2-5
mutations, respectively. Amino acid residues that constitute a consensus AP2 domain
motif for AP2, ANT, and RAP2.7 are shown below the alignment with invariant residues
shown capitalized. Figure 4B shows members of the EREBP-like subclass. Amino acid
sequence alignment between the AP2 domains contained within the tobacco EREBPs and
the Arabidopsis EREBP-like RAP2 proteins is shown. GenBank accession numbers for
EREBP-1, EREBP-2, EREBP-3, and EREBP-4 are D38123, D38126, D38124, and
D38125, respectively.

Figure 4C provides schematic diagrams of the putative RAP2.7-R1,
AP2-R1, and ANT-R1 amphipathic α-helices. Amino acid residues within the RAP2.7-R1,
AP2-R1, and ANT-R1 motifs shown underlined in Figure 4A that are predicted to form
amphipathic α-helices are schematically displayed with residues rotating clockwise by
100° per residue to form helical structures. Arrows directed toward or away from the
center of the helical wheel diagrams indicate the negative or positive degree of
hydrophobicity as defined by Jones et al. J. Lipid Res. 33:287-296 (1992). Positively
and negatively charged amino acid residues are designated by + and - signs,
respectively.

Figure 4D shows schematic diagrams of the putative RAP2.2, RAP2.5,
RAP2.12, and EREBP-3 amphipathic α-helices. Amino acid residues within the
RAP2.2, RAP2.5, RAP2.12, and EREBP-3 motifs shown underlined in Figure 4B that
are predicted to form amphipathic α-helices are schematically displayed as described in
Figure 4C.

Figure 4E shows sequence alignment between the 25-26 amino acid linker 
regions in AP2, ANT, and RAP2.7. R1 and R2 designate the positions of the R1 and R2
repeats within AP2, ANT, and RAP2.7 relative to the linker region sequences. Boxes 
designate invariant residues within the conserved linker regions. Amino acid residues 
that constitute a consensus linker region motif for AP2, ANT, and RAP2.7 are shown 
below the alignment with invariant residues shown capitalized. The arrowhead indicates 
the position of the ant-3 mutation described by Klucher et al. Plant Cell 8:137-153 
(1996). 

Figure 5 is a schematic diagram of pAP2, which can be used to construct 
expression vectors of the invention. 

Figure 6 is a schematic diagram of pBEL1, which can be used to construct
expression vectors of the invention.

Figure 7 is a schematic diagram of gene expression.

DESCRIPTION OF THE PREFERRED EMBODIMENTS
This invention relates to plant ADC genes, such as the AP2 and RAP2
genes of Arabidopsis. The invention provides molecular strategies for controlling seed
size and total seed protein using ADC overexpression and antisense gene constructs. In
particular, transgenic plants containing antisense constructs have dramatically increased
seed mass, seed protein, or seed oil. Alternatively, overexpression of ADC using
constructs of the invention leads to reduced seed size and total seed protein. Together,
data presented here demonstrate that a number of agronomically important traits,
including seed mass, total seed protein, and oil content, can be controlled in species of
agricultural importance.

Isolation of ADC nucleic acids 

Generally, the nomenclature and the laboratory procedures in recombinant 
DNA technology described below are those well known and commonly employed in the 
art. Standard techniques are used for cloning, DNA and RNA isolation, amplification 
and purification. Generally enzymatic reactions involving DNA ligase, DNA 
polymerase, restriction endonucleases and the like are performed according to the 
manufacturer's specifications. These techniques and various other techniques are 
generally performed according to Sambrook et al, Molecular Cloning - A Laboratory 
Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, (1989). 

The isolation of ADC nucleic acids may be accomplished by a number of 
techniques. For instance, oligonucleotide probes based on the sequences disclosed here 
can be used to identify the desired gene in a cDNA or genomic DNA library. To 
construct genomic libraries, large segments of genomic DNA are generated by random 
fragmentation, e.g. using restriction endonucleases, and are ligated with vector DNA to 
form concatemers that can be packaged into the appropriate vector. To prepare a cDNA 
library, mRNA is isolated from the desired organ, such as flowers, and a cDNA library 
which contains the ADC gene transcript is prepared from the mRNA. Alternatively, 
cDNA may be prepared from mRNA extracted from other tissues in which ADC genes or
homologs are expressed.

The cDNA or genomic library can then be screened using a probe based 
upon the sequence of a cloned ADC gene disclosed here. Probes may be used to 
hybridize with genomic DNA or cDNA sequences to isolate homologous genes in the 
same or different plant species. Alternatively, antibodies raised against an ADC
polypeptide can be used to screen an mRNA expression library. 

Alternatively, the nucleic acids of interest can be amplified from nucleic
acid samples using amplification techniques. For instance, polymerase chain reaction
(PCR) technology can be used to amplify the sequences of the ADC genes directly from
genomic DNA, from cDNA, from genomic libraries or cDNA libraries. PCR and other
in vitro amplification methods may also be useful, for example, to clone nucleic acid
sequences that code for proteins to be expressed, to make nucleic acids to use as probes
for detecting the presence of the desired mRNA in samples, for nucleic acid sequencing,
or for other purposes.

Appropriate primers and probes for identifying ADC sequences from plant 
tissues are generated from comparisons of the sequences provided in Jofuku et al., supra. 
For a general overview of PCR see PCR Protocols: A Guide to Methods and
Applications (Innis, M., Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press,
San Diego (1990).

As noted above, the nucleic acids of the invention are characterized by the 
presence of sequence encoding an AP2 domain. Thus, these nucleic acids can be 
identified by their ability to specifically hybridize to sequences encoding the AP2 domains 
disclosed here. Primers which specifically amplify AP2 domains of the exemplified 
genes are particularly useful for identification of particular ADC polynucleotides. 
Primers suitable for this purpose based on the sequence of RAP2 genes disclosed here 
are as follows: 

Name      GenBank Number   Primers
AP2       U12546           JOAP2U       5'-GTTGCCGCTGCCGTAGTG-3'
                           JOAP2L       5'-GGTTCATCCTGAGCCGCATATC-3'
RAP2.1    AF003094         JORAP2.1U    5'-CTCAAGAAGAAGTGCCTAACCACG-3'
                           JORAP2.1L    5'-GCAGAAGCTAGAAGAGCGTCGA-3'
RAP2.2    AF003095         JORAP2.2U    5'-GGAAAATGGGCTGCGGAG-3'
                           JORAP2.2L    5'-GTTACCTCCAGCATCGAACGAG-3'
RAP2.4    AF003097         JORAP2.4U    5'-GCTGGATCTTGTTTCGCTTACG-3'
                           JORAP2.4L    5'-GCTTCAAGCTTAGCGTCGACTG-3'
RAP2.5    AF003098         JORAP2.5U    5'-AGATGGGCTTGAAACCCGAC-3'
                           JORAP2.5L    5'-CTGGCTAGGGCTACGCGC-3'
RAP2.6    AF003099         JORAP2.6U    5'-TTCTTTGCCTCCTCAACCATTG-3'
                           JORAP2.6L    5'-TCTGAGTTCCAACATTTTCGGG-3'
RAP2.7    AF003100         JORAP2.7U    5'-GAAATTGGTAACTCCGGTTCCG-3'
                           JORAP2.7L    5'-CCATTTTGCTTTGGCGCATTAC-3'
RAP2.8    AF003101         JORAP2.8U    5'-GGCGTTACGCCTCTACCGG-3'
                           JORAP2.8L    5'-CGCCGTCTTCCAGAACGTTC-3'
RAP2.9    AF003102         JORAP2.9U    5'-ATCACGGATCTGGCTTGGTTC-3'
                           JORAP2.9L    5'-GCCTTCTTCCGTATCAACGTCG-3'
RAP2.10   AF003103         JORAP2.10U   5'-GTCAACTCCGGCGGTTACG-3'
                           JORAP2.10L   5'-TCTCCTTATATACGCCGCCGA-3'
RAP2.11   AF003104         JORAP2.11U   5'-GAGAAGAGCAAAGGCAACAAGAC-3'
                           JORAP2.11L   5'-AGTTGTTAGGAAAATGGTTTGCG-3'
RAP2.12   AF003105         JORAP2.12U   5'-AAACCATTCGTTTTCACTTCGACTC-3'
                           JORAP2.12L   5'-TCACAGAGCGTTTCTGAGAATTAGC-3'

The PCR primers are used under standard PCR conditions (described, for 
instance, in Innis et al.) using the nucleic acids identified in the above GenBank 
accessions as a template. The PCR products generated by any of the reactions can then 
be used to identify nucleic acids of the invention (e.g., from a cDNA library) by their 
ability to hybridize to these products. Particularly preferred hybridization conditions use 
a Hybridization Buffer consisting of: 0.25 M Phosphate Buffer (pH 7.2), 1 mM EDTA, 
1% Bovine Serum Albumin, 7% SDS. Hybridizations are then followed by a first wash 
with 2.0X SSC + 0.1% SDS (or 0.39 M Na+) and subsequent washes with 0.2X SSC + 
0.1% SDS (or 0.042 M Na+). Hybridization temperature will be from about 45°C to 
about 78°C, usually from about 50°C to about 70°C, followed by washes at 18°C. 
Particularly preferred hybridization conditions are as follows: 
Hybridization Temp.   Hybrid. Time   Wash Buffer A   Wash Buffer B
78°C                  48 hrs         18°C            18°C
70°C                  48 hrs         18°C            18°C
65°C                  48 hrs         18°C            18°C
60°C                  72 hrs         18°C            18°C
55°C                  96 hrs         18°C            18°C
45°C                  200 hrs        18°C            No wash
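
The sodium concentrations quoted for the two wash buffers can be checked with simple arithmetic. The sketch below assumes the conventional SSC recipe (1X SSC = 0.15 M NaCl plus 0.015 M trisodium citrate) and a molar mass of about 288.4 g/mol for SDS; neither value is stated in this document, and the function name is illustrative only.

```python
# Assumed recipe (not stated in the text): 1X SSC = 0.15 M NaCl + 0.015 M trisodium citrate.
NACL_M_PER_X = 0.15
CITRATE_M_PER_X = 0.015      # each citrate contributes three Na+
SDS_G_PER_MOL = 288.38       # sodium dodecyl sulfate, one Na+ per molecule


def sodium_molarity(ssc_x, sds_percent_wv=0.0):
    """Return the total Na+ molarity for an SSC strength and a % (w/v) SDS."""
    from_ssc = ssc_x * (NACL_M_PER_X + 3 * CITRATE_M_PER_X)
    from_sds = (sds_percent_wv * 10.0) / SDS_G_PER_MOL  # % w/v -> g/L -> mol/L
    return from_ssc + from_sds


first_wash = sodium_molarity(2.0, 0.1)   # 2.0X SSC + 0.1% SDS
later_wash = sodium_molarity(0.2, 0.1)   # 0.2X SSC + 0.1% SDS
print(f"2.0X SSC + 0.1% SDS: {first_wash:.3f} M Na+")
print(f"0.2X SSC + 0.1% SDS: {later_wash:.3f} M Na+")
```

Under these assumptions the results agree with the 0.39 M and 0.042 M figures given above once the small contribution of sodium from the SDS is included.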



If desired, primers that amplify regions more specific to particular ADC 
genes can be used. The PCR products produced by these primers can be used in the 
hybridization conditions described above to isolate nucleic acids of the invention. 



Name      GenBank Number   Primers
AP2       U12546           AP2U       5'-ATGTGGGATCTAAACGACGCAC-3'
                           AP2L       5'-GATCTTGGTCCACGCCGAC-3'
RAP2.1    AF003094         RAP2.1U    5'-AAG AGG ACC ATC TCT CAG-3'
                           RAP2.1L    5'-AAC ACT CGC TAG CTT CTC-3'
RAP2.2    AF003095         RAP2.2U    5'-TGG TTC AGC AGC CAA CAC-3'
                           RAP2.2L    5'-CAA TGC ATA GAG CTT GAG G-3'
RAP2.4    AF003097         RAP2.4U    5'-ACG GAT TTC ACA TCG GAG-3'
                           RAP2.4L    5'-CTA AGC TAG AAT CGA ATC C-3'
RAP2.5    AF003098         RAP2.5U    5'-TACCGGTTTCGCGCGTAG-3'
                           RAP2.5L    5'-CACCTTCGAAATCAACGACCG-3'
RAP2.6    AF003099         RAP2.6U    5'-TTCCCCGAAAATGTTGGAACTC-3'
                           RAP2.6L    5'-TGGGAGAGAAAAAATTGGTAGATCG-3'
RAP2.7    AF003100         RAP2.7U    5'-CGA TGG AGA CGA AGA CTC-3'
                           RAP2.7L    5'-GTC GGA ACC GGA GTT ACC-3'
RAP2.8    AF003101         RAP2.8U    5'-TCA CTC AAA GGC CGA GAT C-3'
                           RAP2.8L    5'-TAA CAA CAT CAC CGG CTC G-3'
RAP2.9    AF003102         RAP2.9U    5'-GTG AAG GCT TAG GAG GAG-3'
                           RAP2.9L    5'-TGC CTC ATA TGA GTC AGA G-3'
RAP2.10   AF003103         RAP2.10U   5'-TCCCGGAGCTTTTAGCCG-3'
                           RAP2.10L   5'-CAACCCGTTCCAACGATCC-3'
RAP2.11   AF003104         RAP2.11U   5'-TTCTTCACCAGAAGCAGAGCATG-3'
                           RAP2.11L   5'-CTCCATTCATTGCATATAGGGACG-3'
RAP2.12   AF003105         RAP2.12U   5'-GCTTTGGTTCAGAACTCGAACATC-3'
                           RAP2.12L   5'-AGGTTGATAAACGAACGATGCG-3'



Polynucleotides may also be synthesized by well-known techniques as 
described in the technical literature. See, e.g., Carruthers et al., Cold Spring Harbor 
Symp. Quant. Biol. 47:411-418 (1982), and Adams et al., J. Am. Chem. Soc. 105:661 
(1983). Double stranded DNA fragments may then be obtained either by synthesizing 
the complementary strand and annealing the strands together under appropriate 
conditions, or by adding the complementary strand using DNA polymerase with an 
appropriate primer sequence. 

Alternatively, primers that specifically hybridize to highly conserved 
regions in AP2 domains can be used to amplify sequences from widely divergent plant 
species such as Arabidopsis, canola, soybean, tobacco, and snapdragon. Examples of 
such primers are as follows: 
Primer RISZU 1: 5'-GGAYTGTGGGAAACAAGTTTA-3' 
Primer RISZU 2: 5'-TGCAAAGTRACACCTCTATACTT-3' 

Y = pyrimidine (T or C) 
R = purine (A or G). 

Standard nucleic acid hybridization techniques using the conditions 
disclosed above can then be used to identify full length cDNA or genomic clones. 

In addition, the following DNA primers, RISZU 3 and RISZU 4, can be 
used in an inverse PCR reaction to specifically amplify flanking AP2 gene sequences 
from widely divergent plant species. These primers are as follows: 
Primer RISZU 3: 5'-GCATGWGCAGTGTCAAATCCA-3' 
Primer RISZU 4: 5'-GAGGAAGTTCVAAGTATAGA-3' 

W = A or T 
V = G, A, or C 

These primers have been used in standard PCR conditions to amplify ADC 
gene sequences from canola (SEQ ID NO:1) and soybean (SEQ ID NO:2). 
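
The degenerate positions in the RISZU primers can be expanded mechanically. The following is a minimal sketch that enumerates the concrete sequences each primer stands for, using only the four ambiguity codes defined above (Y, R, W, V). The primer strings are transcribed without spaces; the function and variable names are illustrative and are not part of the disclosure.

```python
from itertools import product

# Ambiguity codes as defined in the text; plain bases map to themselves.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "TC",    # pyrimidine
    "R": "AG",    # purine
    "W": "AT",
    "V": "GAC",
}


def expand(primer):
    """Return every concrete sequence represented by a degenerate primer."""
    return ["".join(p) for p in product(*(CODES[base] for base in primer))]


RISZU = {
    "RISZU 1": "GGAYTGTGGGAAACAAGTTTA",
    "RISZU 2": "TGCAAAGTRACACCTCTATACTT",
    "RISZU 3": "GCATGWGCAGTGTCAAATCCA",
    "RISZU 4": "GAGGAAGTTCVAAGTATAGA",
}

for name, seq in RISZU.items():
    variants = expand(seq)
    print(f"{name}: {len(seq)} nt, {len(variants)} variant(s)")
    for v in variants:
        print("   5'-" + v + "-3'")
```

Each of RISZU 1 to 3 carries one two-fold degenerate position and therefore represents two sequences; RISZU 4 carries one three-fold position and represents three.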

Control of ADC activity or gene expression 

One of skill will recognize that a number of methods can be used to 
modulate ADC activity or gene expression. ADC activity can be modulated in the plant 
cell at the gene, transcriptional, posttranscriptional, translational, or posttranslational 
levels as schematically shown in Figure 7. Techniques for modulating ADC activity at 
each of these levels are generally well known to one of skill and are discussed briefly 
below. 

Methods for introducing genetic mutations into plant genes are well 
known. For instance, seeds or other plant material can be treated with a mutagenic 
chemical substance, according to standard techniques. Such chemical substances include, 
but are not limited to, the following: diethyl sulfate, ethylene imine, ethyl 
methanesulfonate and N-nitroso-N-ethylurea. Alternatively, ionizing radiation from 
sources such as, for example, X-rays or gamma rays can be used. Desired mutants are 
selected by assaying for increased seed mass, oil content and other properties. 

Alternatively, homologous recombination can be used to induce targeted 
gene disruptions by specifically deleting or altering the ADC gene in vivo (see, generally, 
Grewal and Klar, Genetics 146: 1221-1238 (1997) and Xu et al, Genes Dev. 10: 2411- 
2422 (1996)). Homologous recombination has been demonstrated in plants (Puchta et 
al, Experientia 50: 277-284 (1994), Swoboda et al, EMBO J. 13: 484-489 (1994); and 
Offringa et al, Proc. Natl. Acad. Sci. USA 90: 7346-7350 (1993)). 

In applying homologous recombination technology to the genes of the 
invention, mutations in selected portions of an ADC gene sequence (including 5' 
upstream, 3' downstream, and intragenic regions) such as those disclosed here are made 
in vitro and then introduced into the desired plant using standard techniques. Since the 
efficiency of homologous recombination is known to be dependent on the vectors used, 
dicistronic gene targeting vectors as described by Mountford et al. Proc. Natl. 
Acad. Sci. USA 91: 4303-4307 (1994); and Vaulont et al. Transgenic Res. 4: 247-255 
(1995) are conveniently used to increase the efficiency of selecting for altered ADC gene 
expression in transgenic plants. The mutated gene will interact with the target wild-type 
gene in such a way that homologous recombination and targeted replacement of the wild- 
type gene will occur in transgenic plant cells, resulting in suppression of ADC activity. 

Alternatively, oligonucleotides composed of a contiguous stretch of RNA 
and DNA residues in a duplex conformation with double hairpin caps on the ends can be 
used. The RNA/DNA sequence is designed to align with the sequence of the target ADC 
gene and to contain the desired nucleotide change. Introduction of the chimeric 
oligonucleotide on an extrachromosomal T-DNA plasmid results in efficient and specific 
ADC gene conversion directed by chimeric molecules in a small number of transformed 
plant cells. This method is described in Cole-Strauss et al. Science 273:1386-1389 
(1996) and Yoon et al. Proc. Natl. Acad. Sci. USA 93: 2071-2076 (1996). 

Gene expression can be inactivated using recombinant DNA techniques by 
transforming plant cells with constructs comprising transposons or T-DNA sequences. 
ADC mutants prepared by these methods are identified according to standard techniques. 
For instance, mutants can be detected by PCR or by detecting the presence or absence of 
ADC mRNA, e.g., by Northern blots. Mutants can also be selected by assaying for 
increased seed mass, oil content and other properties. 

The isolated nucleic acid sequences prepared as described herein can also 
be used in a number of techniques to control endogenous ADC gene expression at various 
levels. Subsequences from the sequences disclosed here can be used to control 
transcription, RNA accumulation, translation, and the like. 

A number of methods can be used to inhibit gene expression in plants. 
For instance, antisense technology can be conveniently used. To accomplish this, a 
nucleic acid segment from the desired gene is cloned and operably linked to a promoter 
such that the antisense strand of RNA will be transcribed. The construct is then 
transformed into plants and the antisense strand of RNA is produced. In plant cells, it 
has been suggested that antisense suppression can act at all levels of gene regulation 
including suppression of RNA translation (see, Bourque Plant Sci. (Limerick) 105: 125- 
149 (1995); Pantopoulos In Progress in Nucleic Acid Research and Molecular Biology, 
Vol. 48. Cohn, W. E. and K. Moldave (Ed.). Academic Press, Inc.: San Diego, 
California, USA; London, England, UK. p. 181-238; Heiser et al. Plant Sci. (Shannon) 
127: 61-69 (1997)) and by preventing the accumulation of mRNA which encodes the 
protein of interest (see, Baulcombe Plant Mol. Bio. 32:79-88 (1996); Prins and 
Goldbach Arch. Virol. 141: 2259-2276 (1996); Metzlaff et al. Cell 88: 845-854 (1997), 
Sheehy et al., Proc. Nat. Acad. Sci. USA, 85:8805-8809 (1988), and Hiatt et al., U.S. 
Patent No. 4,801,340). 
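
The orientation requirement for an antisense construct can be illustrated with a short sketch. A segment cloned in reverse-complement orientation behind a promoter yields a transcript complementary to the endogenous mRNA. The 12-nucleotide segment below is hypothetical and chosen only for illustration; it is not taken from any ADC gene, and the function names are likewise assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_insert(sense_segment):
    """Return the segment in reverse-complement (antisense) orientation."""
    return sense_segment.upper().translate(COMPLEMENT)[::-1]


def transcribe(coding_strand):
    """Return the RNA whose sequence matches the given coding strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


sense = "ATGGCTTCAGGA"            # hypothetical sense-strand segment
insert = antisense_insert(sense)   # orientation placed downstream of the promoter
antisense_rna = transcribe(insert)

print("sense DNA     5'-" + sense + "-3'")
print("insert DNA    5'-" + insert + "-3'")
print("antisense RNA 5'-" + antisense_rna + "-3'")
```

Applying `antisense_insert` twice returns the original segment, which is a convenient check that the orientation has been reversed and complemented rather than merely reversed.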

The nucleic acid segment to be introduced generally will be substantially 
identical to at least a portion of the endogenous ADC gene or genes to be repressed. The 
sequence, however, need not be perfectly identical to inhibit expression. The vectors of 
the present invention can be designed such that the inhibitory effect applies to other 
genes within a family of genes exhibiting homology or substantial homology to the target 
gene. 

For antisense suppression, the introduced sequence also need not be full 
length relative to either the primary transcription product or fully processed mRNA. 
Generally, higher homology can be used to compensate for the use of a shorter sequence. 
Furthermore, the introduced sequence need not have the same intron or exon pattern, and 
homology of non-coding segments may be equally effective. Normally, a sequence of 
between about 30 or 40 nucleotides and about full length nucleotides should be used, 
though a sequence of at least about 100 nucleotides is preferred, a sequence of at least 
about 200 nucleotides is more preferred, and a sequence of about 500 to about 1700 
nucleotides is especially preferred. 

A number of gene regions can be targeted to suppress ADC gene 
expression. The targets can include, for instance, the coding regions (e.g., regions 
flanking the AP2 domains), introns, sequences from exon/intron junctions, 5' or 3' 
untranslated regions, and the like. In some embodiments, the constructs can be designed 
to eliminate the ability of regulatory proteins to bind to ADC gene sequences that are 
required for its cell- and/or tissue-specific expression. Such transcriptional regulatory 
sequences can be located either 5' or 3' of, or within, the coding region of the gene and can 
either promote (positive regulatory element) or repress (negative regulatory element) 
gene transcription. These sequences can be identified using standard deletion analysis, 
well known to those of skill in the art. Once the sequences are identified, an antisense 
construct targeting these sequences is introduced into plants to control AP2 gene 
transcription in particular tissue, for instance, in developing ovules and/or seed. 

Oligonucleotide-based triple-helix formation can be used to disrupt ADC 
gene expression. Triplex DNA can inhibit DNA transcription and replication, generate 
site-specific mutations, cleave DNA, and induce homologous recombination (see, e.g., 
Havre and Glazer J. Virology 67:7324-7331 (1993); Scanlon et al FASEB J. 9:1288- 
1296 (1995); Giovannangeli et al Biochemistry 35:10539-10548 (1996); Chan and Glazer 
J. Mol. Medicine (Berlin) 75: 267-282 (1997)). Triple helix DNAs can be used to target 
the same sequences identified for antisense regulation. 

Catalytic RNA molecules or ribozymes can also be used to inhibit 
expression of ADC genes. It is possible to design ribozymes that specifically pair with 
virtually any target RNA and cleave the phosphodiester backbone at a specific location, 
thereby functionally inactivating the target RNA. In carrying out this cleavage, the 
ribozyme is not itself altered, and is thus capable of recycling and cleaving other 
molecules, making it a true enzyme. The inclusion of ribozyme sequences within 
antisense RNAs confers RNA-cleaving activity upon them, thereby increasing the activity 
of the constructs. Thus, ribozymes can be used to target the same sequences identified 
for antisense regulation. 

A number of classes of ribozymes have been identified. One class of 
ribozymes is derived from a number of small circular RNAs which are capable of self- 
cleavage and replication in plants. The RNAs replicate either alone (viroid RNAs) or 
with a helper virus (satellite RNAs). Examples include RNAs from avocado sunblotch 
viroid and the satellite RNAs from tobacco ringspot virus, lucerne transient streak virus, 
velvet tobacco mottle virus, solanum nodiflorum mottle virus and subterranean clover 
mottle virus. The design and use of target RNA-specific ribozymes is described in Zhao 
and Pick Nature 365:448-451 (1993); Eastham and Ahlering J. Urology 156:1186-1188 
(1996); Sokol and Murray Transgenic Res. 5:363-371 (1996); Sun et al. Mol. 
Biotechnology 7:241-251 (1997); and Haseloff et al. Nature, 334:585-591 (1988). 

Another method of suppression is sense cosuppression. Introduction of 
nucleic acid configured in the sense orientation has been recently shown to be an 
effective means by which to block the transcription of target genes. For examples of 
the use of this method to modulate expression of endogenous genes, see Assaad et al. 
Plant Mol. Bio. 22: 1067-1085 (1993); Flavell Proc. Natl. Acad. Sci. USA 91: 3490- 
3496 (1994); Stam et al. Annals Bot. 79: 3-12 (1997); Napoli et al., The Plant Cell 
2:279-289 (1990); and U.S. Patents Nos. 5,034,323, 5,231,020, and 5,283,184. 

The suppressive effect may occur where the introduced sequence contains 
no coding sequence per se, but only intron or untranslated sequences homologous to 
sequences present in the primary transcript of the endogenous sequence. The introduced 
sequence generally will be substantially identical to the endogenous sequence intended to 
be repressed. This minimal identity will typically be greater than about 65%, but a 
higher identity might exert a more effective repression of expression of the endogenous 
sequences. Substantially greater identity of more than about 80% is preferred, though 
about 95% to absolute identity would be most preferred. As with antisense regulation, 
the effect should apply to any other proteins within a similar family of genes exhibiting 
homology or substantial homology. 
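
The identity thresholds given above (greater than about 65%, more than about 80%, and about 95% to absolute identity) can be applied as in the following sketch. It assumes the two sequences have already been aligned to equal length; the alignment step itself is not shown. The sequences are hypothetical, and treating the "about" figures as hard cut-offs is a simplification made only for illustration.

```python
def percent_identity(a, b):
    """Percent identical positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def cosuppression_tier(identity):
    """Map a percent identity onto the preference tiers quoted in the text."""
    if identity >= 95.0:
        return "most preferred"
    if identity > 80.0:
        return "preferred"
    if identity > 65.0:
        return "minimal"
    return "below minimal identity"


endogenous = "ATGGCTTCAGGATCCAAGCT"   # hypothetical endogenous segment
introduced = "ATGGCTTCTGGATCCAAGCA"   # hypothetical introduced segment, 2 mismatches

pid = percent_identity(endogenous, introduced)
print(f"{pid:.1f}% identity: {cosuppression_tier(pid)}")
```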

For sense suppression, the introduced sequence, needing less than absolute 
identity, also need not be full length, relative to either the primary transcription product 
or fully processed mRNA. This may be preferred to avoid concurrent production of 
some plants which are overexpressers. A higher identity in a shorter than full length 
sequence compensates for a longer, less identical sequence. Furthermore, the introduced 
sequence need not have the same intron or exon pattern, and identity of non-coding 
segments will be equally effective. Normally, a sequence of the size ranges noted above 
for antisense regulation is used. In addition, the same gene regions noted for antisense 
regulation can be targeted using cosuppression technologies. 

Alternatively, ADC activity may be modulated by eliminating the proteins 
that are required for ADC cell-specific gene expression. Thus, expression of regulatory 
proteins and/or the sequences that control ADC gene expression can be modulated using 
the methods described here. 

Another method is use of engineered tRNA suppression of ADC mRNA 
translation. This method involves the use of suppressor tRNAs to transactivate target 
genes containing premature stop codons (see, Betzner et al. Plant J. 11:587-595 (1997); 
and Choisne et al. Plant J. 11:597-604 (1997)). A plant line containing a constitutively 
expressed ADC gene that contains an amber stop codon is first created. Multiple lines of 
plants, each containing tRNA suppressor gene constructs under the direction of cell-type 
specific promoters are also generated. The tRNA gene construct is then crossed into the 
ADC line to activate ADC activity in a targeted manner. These tRNA suppressor lines 
could also be used to target the expression of any type of gene to the same cell or tissue 
types. 
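
The logic of amber suppression can be sketched as follows. Translation of a coding sequence that carries a premature amber (TAG) codon terminates early unless a suppressor tRNA is present, in which case the codon is read through. The coding sequence below is hypothetical, the standard genetic code is assumed, and the residue inserted by the suppressor (leucine, "L") is an assumption made for illustration only.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, amber_suppressor=None):
    """Translate a coding sequence; TAG is read through if a suppressor residue is given."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        residue = CODON_TABLE[codon]
        if codon == "TAG" and amber_suppressor is not None:
            residue = amber_suppressor   # suppressor tRNA decodes the amber codon
        if residue == "*":
            break                        # ordinary termination
        protein.append(residue)
    return "".join(protein)


cds = "ATGGCTTAGGGATCCTAA"   # hypothetical: Met-Ala-amber-Gly-Ser-stop

print("no suppressor:  ", translate(cds))
print("with suppressor:", translate(cds, amber_suppressor="L"))
```

Without the suppressor only the truncated product is made; with it, the full-length product is made only in those cells in which the suppressor tRNA gene is expressed, which is the basis of the targeting described above.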

Some ADC proteins (e.g., AP2) are believed to form multimers in vivo. 
As a result, an alternative method for inhibiting ADC function is through use of 
dominant negative mutants. This approach involves transformation of plants with 
constructs encoding mutant ADC polypeptides that form defective multimers with 
endogenous wild-type ADC proteins and thereby inactivate the protein. The mutant 
polypeptide may vary from the naturally occurring sequence at the primary structure 
level by amino acid substitutions, additions, deletions, and the like. These modifications 
can be used in a number of combinations to produce the final modified protein chain. 
Use of dominant negative mutants to inactivate target genes is described in Mizukami et 
al. Plant Cell 8:831-845 (1996). DNA sequence analysis and DNA binding studies 
strongly suggest that AP2 (Jofuku et al, Plant Cell 6: 1211-1225 (1994)) and several 
RAP2s function as transcription factors. Thus, dominant-negative forms of ADC genes 
that are defective in their abilities to bind to DNA can also be used. 

The AP2 protein is thought to exist in both a phosphorylated and a 
nonphosphorylated form. Thus AP2 activity may also be regulated by protein kinase 
signal transduction cascades. In addition, RAP2 gene activity may also be regulated by 
and/or play a role in protein kinase signal transduction cascades (EREBPs, Ohme-Takagi 
and Shinshi Plant Cell 7: 173-182 (1995); AtEBP, Buttner and Singh Proc. Natl. 
Acad. Sci. USA 94: 5961-5966 (1997); Pti4/5/6, Zhou et al. EMBO J. 16: 3207-3218 
(1997)). Thus, mutant forms of the ADC proteins used in dominant negative strategies 
can include substitutions at amino acid residues targeted for phosphorylation so as to 
decrease phosphorylation of the protein. Alternatively, the mutant ADC forms can be 
designed so that they are hyperphosphorylated. 

Glycosylation events are known to affect protein activity in a cell- and/or 
tissue-specific manner (see, Meshi and Iwabuchi Plant Cell Physiol. 36: 1405-1420 
(1995); Meynial-Salles and Combes J. Biotech. 46: 1-14 (1996)). Thus, mutant forms 
of the ADC proteins can also include those in which amino acid residues that are targeted 
for glycosylation are altered in the same manner as that described for phosphorylation 
mutants. 

AP2 may carry out some of its functions through its interactions with other 
transcription factors/proteins (e.g., AINTEGUMENTA, Elliott et al. Plant Cell 8: 155- 
168 (1996); Klucher et al. Plant Cell 8: 137-153 (1996); CURLY LEAF, Goodrich et 
al. Nature (London) 386: 44-51 (1997); or LEUNIG, Liu and Meyerowitz Development 
121: 975-991 (1995)). Thus, one simple method for suppressing ADC activity is to 
suppress the activities of proteins that are required for AP2 activity. ADC activity can 
thus be controlled by "titrating" out transcription factors/proteins required for ADC 
activity. This can be done by overexpressing domains of ADC proteins that are involved in 
protein:protein interactions in plant cells (e.g., AP2 domains or the putative 
transcriptional activation domain as described in Jofuku et al., Plant Cell 6: 1211-1225 
(1994)). This strategy has been used to modulate gene activity (Lee et al, Exptl. Cell 
Res. 234: 270-276 (1997); Thiesen Gene Expression 5: 229-243 (1996); and Waterman et 
al., Cancer Res. 56:158-163 (1996)). 

Another strategy to affect the ability of an ADC protein to interact with 
itself or with other proteins involves the use of antibodies specific to ADC. In this 
method cell-specific expression of AP2-specific Abs is used to inactivate functional domains 
through antibody:antigen recognition (see, Hupp et al. Cell 83:237-245 (1995)). 

Use of nucleic acids of the invention to enhance ADC gene expression 

Isolated sequences prepared as described herein can also be used to 
introduce expression of a particular ADC nucleic acid to enhance or increase endogenous 
gene expression. Enhanced expression will generally lead to smaller seeds or seedless 
fruit. Where overexpression of a gene is desired, the desired gene from a different 
species may be used to decrease potential sense suppression effects. 

One of skill will recognize that the polypeptides encoded by the genes of 
the invention, like other proteins, have different domains which perform different 
functions. Thus, the gene sequences need not be full length, so long as the desired 
functional domain of the protein is expressed. The distinguishing features of ADC 
polypeptides, including the AP2 domain, are discussed in detail below. 

Modified protein chains can also be readily designed utilizing various 
recombinant DNA techniques well known to those skilled in the art and described in 
detail, below. For example, the chains can vary from the naturally occurring sequence at 
the primary structure level by amino acid substitutions, additions, deletions, and the like. 
These modifications can be used in a number of combinations to produce the final 
modified protein chain. 

Preparation of recombinant vectors 

To use isolated sequences in the above techniques, recombinant DNA 
vectors suitable for transformation of plant cells are prepared. Techniques for 
transforming a wide variety of higher plant species are well known and described in the 
technical and scientific literature. See, for example, Weising et al. Ann. Rev. Genet. 
22:421-477 (1988). A DNA sequence coding for the desired polypeptide, for example a 
cDNA sequence encoding a full length protein, will preferably be combined with 
transcriptional and translational initiation regulatory sequences which will direct the 
transcription of the sequence from the gene in the intended tissues of the transformed 
plant. 

For example, for overexpression, a plant promoter fragment may be 
employed which will direct expression of the gene in all tissues of a regenerated plant. 
Such promoters are referred to herein as "constitutive" promoters and are active under 
most environmental conditions and states of development or cell differentiation. 
Examples of constitutive promoters include the cauliflower mosaic virus (CaMV) 35S 
transcription initiation region, the 1'- or 2'- promoter derived from T-DNA of 
Agrobacterium tumefaciens, and other transcription initiation regions from various plant 
genes known to those of skill. Such genes include for example, the AP2 gene, ACT11 
from Arabidopsis (Huang et al. Plant Mol. Biol. 33:125-139 (1996)), Cat3 from 
Arabidopsis (GenBank No. U43147, Zhong et al, Mol. Gen. Genet. 251:196-203 
(1996)), the gene encoding stearoyl-acyl carrier protein desaturase from Brassica napus 
(Genbank No. X74782, Slocombe et al. Plant Physiol. 104:1167-1176 (1994)), GPc1 
from maize (GenBank No. X15596, Martinez et al. J. Mol. Biol 208:551-565 (1989)), 
and Gpc2 from maize (GenBank No. U45855, Manjunath et al, Plant Mol. Biol. 33:97- 
112 (1997)). 

Alternatively, the plant promoter may direct expression of the ADC nucleic 
acid in a specific tissue or may be otherwise under more precise environmental or 
developmental control. Examples of environmental conditions that may affect 
transcription by inducible promoters include anaerobic conditions, elevated temperature, 
or the presence of light. Such promoters are referred to here as "inducible" or "tissue- 
specific" promoters. One of skill will recognize that a tissue-specific promoter may 
drive expression of operably linked sequences in tissues other than the target tissue. 
Thus, as used herein a tissue-specific promoter is one that drives expression 
preferentially in the target tissue, but may also lead to some expression in other tissues 
as well. 

Examples of promoters under developmental control include promoters that 
initiate transcription only (or primarily only) in certain tissues, such as fruit, seeds, or 
flowers. Promoters that direct expression of nucleic acids in ovules, flowers or seeds are 
particularly useful in the present invention. As used herein a seed-specific promoter is 
one which directs expression in seed tissues; such promoters may be, for example, ovule- 
specific, embryo-specific, endosperm-specific, integument-specific, seed coat-specific, or 
some combination thereof. Examples include a promoter from the ovule-specific BEL1 
gene described in Reiser et al. Cell 83:735-742 (1995) (GenBank No. U39944). Other 
suitable seed specific promoters are derived from the following genes: MAC1 from maize 
(Sheridan et al. Genetics 142:1009-1020 (1996)), Cat3 from maize (GenBank No. 
L05934, Abler et al. Plant Mol. Biol. 22:1031-1038 (1993)), the gene encoding oleosin 
18kD from maize (GenBank No. J05212, Lee et al. Plant Mol. Biol. 26:1981-1987 
(1994)), viviparous-1 from Arabidopsis (Genbank No. U93215), the gene encoding 
oleosin from Arabidopsis (Genbank No. Z17657), Atmyc1 from Arabidopsis (Urao et 
al. Plant Mol. Biol. 32:571-576 (1996)), the 2s seed storage protein gene family from 
Arabidopsis (Conceicao et al. Plant J. 5:493-505 (1994)), the gene encoding oleosin 20kD 
from Brassica napus (GenBank No. M63985), napA from Brassica napus (GenBank No. 
J02798, Josefsson et al. J. Biol. Chem. 262:12196-12201 (1987)), the napin gene family from Brassica 
napus (Sjodahl et al. Planta 197:264-271 (1995)), the gene encoding the 2S storage 
protein from Brassica napus (Dasgupta et al. Gene 133:301-302 (1993)), the genes 
encoding oleosin A (Genbank No. U09118) and oleosin B (Genbank No. U09119) from 
soybean and the gene encoding low molecular weight sulphur rich protein from soybean 
(Choi et al. Mol. Gen. Genet. 246:266-268 (1995)). 

If proper polypeptide expression is desired, a polyadenylation region at the 
3'-end of the coding region should be included. The polyadenylation region can be 
derived from the natural gene, from a variety of other plant genes, or from T-DNA. 

The vector comprising the sequences (e.g., promoters or coding regions) 
from genes of the invention will typically comprise a marker gene which confers a 
selectable phenotype on plant cells. For example, the marker may encode biocide 
resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418, 
bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorosulfuron or 
Basta. 



Production of transgenic plants 

DNA constructs of the invention may be introduced into the genome of the 
desired plant host by a variety of conventional techniques. For example, the DNA 
construct may be introduced directly into the genomic DNA of the plant cell using 
techniques such as electroporation and microinjection of plant cell protoplasts, or the 
DNA constructs can be introduced directly to plant tissue using ballistic methods, such as 
DNA particle bombardment. 

Microinjection techniques are known in the art and well described in the 
scientific and patent literature. The introduction of DNA constructs using polyethylene 
glycol precipitation is described in Paszkowski et al. EMBO J. 3:2717-2722 (1984). 
Electroporation techniques are described in Fromm et al. Proc. Natl. Acad. Sci. USA 
82:5824 (1985). Ballistic transformation techniques are described in Klein et al. Nature 
327:70-73 (1987). 

Alternatively, the DNA constructs may be combined with suitable T-DNA 
flanking regions and introduced into a conventional Agrobacterium tumefaciens host 
vector. The virulence functions of the Agrobacterium tumefaciens host will direct the 
insertion of the construct and adjacent marker into the plant cell DNA when the cell is 
infected by the bacteria. Agrobacterium tumefaciens-mediated transformation techniques, 
including disarming and use of binary vectors, are well described in the scientific 
literature. See, for example Horsch et al. Science 233:496-498 (1984), and Fraley et al. 
Proc. Natl. Acad. Sci. USA 80:4803 (1983). 

Transformed plant cells which are derived by any of the above 
transformation techniques can be cultured to regenerate a whole plant which possesses 
the transformed genotype and thus the desired phenotype such as increased seed mass. 
Such regeneration techniques rely on manipulation of certain phytohormones in a tissue 
culture growth medium, typically relying on a biocide and/or herbicide marker which has 
been introduced together with the desired nucleotide sequences. Plant regeneration from 
cultured protoplasts is described in Evans et al., Protoplasts Isolation and Culture, 
Handbook of Plant Cell Culture, pp. 124-176, Macmillan Publishing Company, New 
York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC 
Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants, 
organs, or parts thereof. Such regeneration techniques are described generally in Klee et 
al. Ann. Rev. of Plant Phys. 38:467-486 (1987). 

The nucleic acids of the invention can be used to confer desired traits on 
essentially any plant. Thus, the invention has use over a broad range of plants, including 
species from the genera Anacardium, Arachis, Asparagus, Atropa, Avena, Brassica, 
Citrus, Citrullus, Capsicum, Carthamus, Cocos, Coffea, Cucumis, Cucurbita, Daucus, 
Elaeis, Fragaria, Glycine, Gossypium, Helianthus, Hemerocallis, Hordeum, Hyoscyamus, 
Lactuca, Linum, Lolium, Lupinus, Lycopersicon, Malus, Manihot, Majorana, Medicago, 
Nicotiana, Olea, Oryza, Panicum, Pennisetum, Persea, Phaseolus, Pistacia, Pisum, 
Pyrus, Prunus, Raphanus, Ricinus, Secale, Senecio, Sinapis, Solanum, Sorghum, 
Theobroma, Trigonella, Triticum, Vicia, Vitis, Vigna, and Zea. 

Increasing seed size, protein, amino acid, and oils content is particularly 
desirable in crop plants in which seed are used directly for animal or human consumption 
or for industrial purposes. Examples include soybean, canola, and grains such as rice, 
wheat, corn, rye, and the like. Decreasing seed size, or producing seedless varieties, is 
particularly important in plants grown for their fruit and in which large seeds may be 
undesirable. Examples include cucumbers, tomatoes, melons, and cherries. 

One of skill will recognize that after the expression cassette is stably 
incorporated in transgenic plants and confirmed to be operable, it can be introduced into 
other plants by sexual crossing. Any of a number of standard breeding techniques can be 
used, depending upon the species to be crossed. 

Since transgenic expression of the nucleic acids of the invention leads to 
phenotypic changes in seeds and fruit, plants comprising the expression cassettes 
discussed above must be sexually crossed with a second plant to obtain the final product. 
The seed of the invention can be derived from a cross between two transgenic plants of 
the invention, or a cross between a plant of the invention and another plant. The desired 
effects (e.g., increased seed mass) are generally enhanced when both parental plants 
contain expression cassettes of the invention. 

Seed obtained from plants of the present invention can be analyzed 
according to well known procedures to identify seed with the desired trait. Increased or 
decreased size can be determined by weighing seeds or by visual inspection. Protein 
content is conveniently measured by the method of Bradford, Anal. Biochem. 72:248 
(1976). Oil content is determined using standard procedures such as gas 
chromatography. These procedures can also be used to determine whether the types of 
fatty acids and other lipids are altered in the plants of the invention. 

Using these procedures one of skill can identify the seed of the invention 
by the presence of the expression cassettes of the invention and increased seed mass. 
Usually, the seed mass will be at least about 10%, often about 20% greater than the 
average seed mass of plants of the same variety that lack the expression cassette. The 
mass can be about 50% greater and preferably at least about 75% to about 100% greater. 
Increases in other properties, e.g., protein and oil, will usually be proportional to the 
increases in mass. Thus, in some embodiments protein or oil content can increase by 
about 10%, 20%, 50%, 75% or 100%, or in approximate proportion to the increase in 
mass. 

Alternatively, seed of the invention in which AP2 expression is enhanced 
will have the expression cassettes of the invention and decreased seed mass. Seed mass 
will be at least about 20% less than the average seed mass of plants of the same variety 
that lack the expression cassette. Often the mass will be about 50% less and preferably 
at least about 75% less or the seed will be absent. As above, decreases in other 
properties, e.g., protein and oil, will be proportional to the decreases in mass. 

The following Examples are offered by way of illustration, not limitation. 

Example 1 
AP2 Gene Isolation 
The isolation and characterization of an AP2 gene from Arabidopsis is 
described in detail in Jofuku et al., supra. Briefly, T-DNA from Agrobacterium was 
used as an insertional mutagen to identify and isolate genes controlling flower formation 
in Arabidopsis. One transformed line, designated T10, segregated 3 to 1 for a flower 
mutant that phenotypically resembled many allelic forms of the floral homeotic mutant 
ap2. T10 was tested and it was confirmed genetically that T10 and ap2 are allelic. The 
mutant was designated as ap2-10. 
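
A 3 to 1 segregation is the ratio expected for a single recessive Mendelian mutation. A minimal sketch of how such a ratio can be checked with a chi-square goodness-of-fit test is given here. The plant counts are hypothetical and are not data from this study, and the function name is illustrative only.

```python
def segregation_chi_square(wild_type: int, mutant: int, ratio=(3, 1)) -> float:
    """Chi-square statistic for observed counts against an expected ratio."""
    total = wild_type + mutant
    parts = sum(ratio)
    expected = [total * ratio[0] / parts, total * ratio[1] / parts]
    observed = [wild_type, mutant]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical chi-square value for 1 degree of freedom at p = 0.05.
CHI2_CRITICAL_DF1 = 3.841

# Hypothetical counts, for illustration only (not taken from the source).
chi2 = segregation_chi_square(wild_type=80, mutant=20)
consistent = chi2 < CHI2_CRITICAL_DF1
print(f"chi-square = {chi2:.3f}; consistent with 3:1: {consistent}")
```

A statistic below the critical value means the observed counts do not differ significantly from a 3:1 ratio at the 5% level.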

It was determined that ap2-10 was the product of a T-DNA insertion 
mutation by genetic linkage analysis using the T-DNA-encoded neomycin 
phosphotransferase II (NPTII) gene as a genetic marker. An overlapping set of 
T-DNA-containing recombinant phage was selected from an ap2-10 genome library and 
the plant DNA sequences flanking the T-DNA insertion element were used as 
hybridization probes to isolate phage containing the corresponding region from a 
wild-type Arabidopsis genome library. The site of T-DNA insertion in ap2-10 was 
mapped to a 7.2-kb EcoRI fragment centrally located within the AP2 gene region. 

Five Arabidopsis flower cDNA clones corresponding to sequences within 
the 7.2-kb AP2 gene region were isolated. All five cloned cDNAs were confirmed to 
represent AP2 gene transcripts using an antisense gene strategy to induce ap2 mutant 
flowers in wild-type plants. 

To determine AP2 gene structure, the nucleotide sequences of the cDNA 
inserts were compared to that of the 7.2-kb AP2 genomic fragment. These results 
showed that the AP2 gene is 2.5 kb in length and contains 10 exons and 9 introns that 
range from 85 to 110 bp in length. The AP2 gene encodes a theoretical polypeptide of 
432 amino acids with a predicted molecular mass of 48 kD. The AP2 nucleotide and 
predicted protein sequences were compared with a merged, nonredundant data base. It 
was found that AP2 had no significant global similarity to any known regulatory protein. 
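
As a rough plausibility check on the predicted molecular mass, the following sketch multiplies the residue count by an average residue mass. The figure of about 110 Da per residue is a common rule of thumb and an assumption here. The actual prediction would be computed from the amino acid composition, which this sketch does not attempt.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons (assumption).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approximate_mass_kd(residue_count: int) -> float:
    """Approximate polypeptide mass in kD from its length in residues."""
    return residue_count * AVERAGE_RESIDUE_MASS_DA / 1000.0


# 432 residues is the polypeptide length stated in the text.
estimate = approximate_mass_kd(432)
print(f"approximate mass: {estimate:.1f} kD")
```

The estimate falls close to the 48 kD stated above, which is the kind of agreement expected from this approximation.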

Sequence analysis, however, did reveal the presence of several sequence 
features that may be important for AP2 protein structure or function. First, AP2 contains 
a 37-amino acid serine-rich acidic domain (amino acids 14 to 50) that is analogous to 
regions that function as activation domains in a number of RNA polymerase II 
transcription factors. Second, AP2 has a highly basic 10-amino acid domain (amino 
acids 119 to 128) that includes a putative nuclear localization sequence, KKSR, suggesting 
that AP2 may function in the nucleus. Finally, the central core of the AP2 
polypeptide (amino acids 129 to 288) contains two copies of a 68-amino acid direct 
repeat that is referred to here as the AP2 domain. The two copies of this repeat, 
designated AP2-R1 and AP2-R2, share 53% amino acid identity and 69% amino acid 
homology. Figure 1A shows that each AP2 repeat contains an 18-amino acid conserved 
core region that shares 83% amino acid homology. Figure 1B shows that both copies of 
this core region are theoretically capable of forming amphipathic α-helical structures that 
may participate in protein-protein interactions. SEQ ID NO: 3 is the full length AP2 
genomic sequence. 

Example 2 
Preparation of AP2 Constructs 

Gene constructs were made comprising the AP2 gene coding region 
described above in a transcriptional fusion with the cauliflower mosaic virus 35S 
constitutive promoter in both the sense and antisense orientations. The original vector 
containing the 35S promoter, pGSJ780A, was obtained from Plant Genetic Systems (Gent, 
Belgium). The pGSJ780A vector was modified by inserting a ClaI-BamHI adaptor 
containing an EcoRI site in the unique BamHI site of pGSJ780A. The modified 
pGSJ780A DNA was linearized with EcoRI and the AP2 gene coding region inserted as 
a 1.68 kb EcoRI fragment in both sense and antisense orientations with respect to the 
35S promoter (see, Figures 2 and 3). 

The resultant DNA was transformed into E. coli and spectinomycin 
resistant transformants were selected. Plasmid DNAs were isolated from individual 
transformants and the orientation of the insert DNAs relative to the 35S promoter was 
confirmed by DNA sequencing. Bacterial cells containing the 35S/AP2 sense (designated 
pPW12.4 and pPW9) and 35S/AP2 antisense (designated pPW14.4 and pPW15) 
constructs were conjugated to Agrobacterium tumefaciens and rifampicin, spectinomycin 
resistant transformants were selected for use in Agrobacterium-mediated plant 
transformation experiments. 

The 35S/AP2 sense and 35S/AP2 antisense constructs were introduced into 
wild-type Arabidopsis and tobacco plants according to standard techniques. Stable 
transgenic plant lines were selected using the plant selectable marker NPTII (which 
confers resistance to the antibiotic kanamycin) present on the modified Ti plasmid vector 
pGSJ780A. 

Example 3 
Modification of Seed using AP2 Sequences 
This example shows that ap2 mutant plants and transgenic plants 
containing the 35S/AP2 antisense construct produced seed with increased mass and total 
protein content. By contrast, transgenic plants containing the 35S/AP2 sense construct 
produced seed with decreased mass and protein content. Together these results indicate 
that seed mass and seed contents in transgenic plants can be modified by genetically 
altering AP2 activity. 

Seed from 30 lines were analyzed for altered seed size and seed protein 
content including the Arabidopsis ap2 mutants ap2-1, ap2-3, ap2-4, ap2-5, ap2-6, ap2-9 
and ap2-10 and transgenic Arabidopsis and transgenic tobacco containing the CaMV 
35S/AP2 antisense gene construct, the CaMV 35S/AP2 sense gene construct, or the 
pGSJ780A vector as described above. The ap2 mutants used in this study are described 
in Komaki et al., Development 104, 195-203 (1988), Kunst et al., Plant Cell 1, 
1195-1208 (1989), Bowman et al., Development 112, 1-20 (1991), and Jofuku et al., 
supra. 

Due to the small size of Arabidopsis and tobacco seed, average seed mass 
was determined by weighing seed in batches of 100 for Arabidopsis and 50 seed for 
tobacco. The net change in seed mass due to changes in AP2 gene activity was 
calculated by subtracting the average mass of wild-type seed from mutant seed mass. 
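
The calculation described above can be sketched as follows. The function names and the batch masses are illustrative assumptions, not values from the tables. Expressing the net change as a percentage of the wild-type mass is assumed to be how the percent columns in the tables were derived.

```python
def net_change(mutant_mass: float, wild_type_mass: float) -> float:
    """Net change in batch mass: mutant (or transgenic) minus wild-type."""
    return mutant_mass - wild_type_mass


def percent_change(mutant_mass: float, wild_type_mass: float) -> float:
    """Net change expressed as a percentage of the wild-type batch mass."""
    return 100.0 * net_change(mutant_mass, wild_type_mass) / wild_type_mass


# Hypothetical batch masses in mg per 100 seed, for illustration only.
example = percent_change(mutant_mass=3.0, wild_type_mass=2.0)
print(f"{example:+.0f}% relative to wild-type")
```

Because published batch masses are rounded to one decimal place, percentages recomputed from them may differ by a few points from values calculated on unrounded measurements.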

Seed from three wild-type Arabidopsis ecotypes, C24, Landsberg-er, and 
Columbia, and one wild-type tobacco, SR1, were used as controls. Wild-type Arabidopsis 
seed display seasonal variations in seed mass which range from 1.6-2.3 mg per 100 seed 
as shown in Table I. Therefore transgenic Arabidopsis seed were compared to control 
seed that had been harvested at approximately the same time of season. This proved to 
be important for comparing the effects of weak ap2 mutations on seed mass. 

Table I shows that all ap2 mutant seed examined, ap2-1, ap2-3, ap2-4, 
ap2-5, ap2-6, ap2-9, and ap2-10, show a significant increase in average seed mass 
ranging from +27 to +104 percent compared to wild-type. The weak partial loss-of-function 
mutants such as ap2-1 and ap2-3 show the smallest gain in average seed mass 
ranging from +27 percent to +40 percent of wild-type, respectively. By contrast, 
strong ap2 mutants such as ap2-6 and ap2-10 show the largest gain in seed mass ranging 
from +69 percent to +104 percent of wild-type, respectively. Thus reducing AP2 gene 
activity genetically consistently increases Arabidopsis seed mass. 

AP2 antisense and AP2 sense cosuppression strategies described above 
were used to reduce AP2 gene activity in planta to determine whether seed mass could be 
manipulated in transgenic wild-type plants. Twenty-nine independent lines of transgenic 
Arabidopsis containing the CaMV 35S/AP2 antisense gene constructs pPW14.4 and 
pPW15 (Figure 2) were generated. Each transgenic line used in this study tested positive 
for kanamycin resistance and the presence of one or more copies of T-DNA. 

Table I shows that seed from nine transgenic Arabidopsis AP2 antisense 
lines show a significant increase in seed mass when compared to control seed ranging 
from +22 percent for line C24 15-542 to +89 percent for line C24 15-566. Both C24 
and Landsberg-er ecotypes were used successfully. Increased seed mass was observed in 
F1, F2, and F3 generation seed. 

Eight lines containing the 35S/AP2 sense gene construct were generated 
which were phenotypically cosuppression mutants. As shown in Table I seed from two 
cosuppression lines examined showed larger seed that range from +26 percent to +86 
percent. By contrast, plants transformed with the vector pGSJ780A showed a normal 
range of average seed mass ranging from -0.5 percent to +13 percent compared to wild- 
type seed (Table I). Together, these results demonstrate that AP2 gene sequences can be 
used to produce a significant increase in Arabidopsis seed mass using both antisense and 
cosuppression strategies in a flowering plant. 

Table I. Genetic control of Arabidopsis seed mass by AP2. 

                                    Average seed mass       Percent change in seed mass 
                                    in mg per 100 seed¹,²   compared to wild-type 

ap2 mutant seed 
                                    2.1 (0.1)               +27% 
1. ap2-1                            2.2 (0.1)               +33% 
                                    2.1 (0.2)               +31% 
                                    2.8 (0.2)               +33% 
2. ap2-3                            2.6 (0.1)               +27% 
                                    3.5 (0.3)               +69% 
                                    3.5 (0.2)               +69% 
4. ap2-5                            2.9 (0.1)               +39% 
5. ap2-6                            3.5 (0.2)               +69% 
6. ap2-9                            2.9 (0.1)               +40% 
7. ap2-10                           3.7 (0.4)               +79% 
                                    3.9 (0.3)               +90% 
                                    4.2 (0.5)               +104% 

Seed produced by transgenic CaMV 35S/AP2 antisense lines (from a Km resistant mother) 
1. C24 14.4E (F1-15) F2 sd          3.1                     +35% 
   C24 14.4E (F1-15) F3 sd          3.4 (0.3)               +47% 
2. C24 14.4S (F1-1)                 2.8 (0.2)               +29% 
3. C24 14.4AA (F1-24)               2.9 (0.1)               +30% 
4. C24 14.4DD (F1-2)                2.8 (0.3)               +30% 
5. C24 15-522                       3.6 (0.1)               +76% 
6. C24 15-542 (F1-2)                2.6 (0.1)               +25% 
   C24 15-542 (F1-7)                2.5 (0.2)               +22% 
7. C24 15-566                       3.9 (0.1)               +89% 
8. LE 15-9992-3 (F1-1) F2 sd        2.4 (0.1)               +42% 
9. LE 15-83192-3 (F1-3)             2.8 (0.0)               +33% 
   LE 15-83192-3 (F1-17)            2.7 (0.0)               +28% 




Seed produced by transgenic CaMV 35S/AP2 cosuppression lines (from a Km resistant mother) 
1. C24 9-5 (F1-5)                   3.8 (0.0)               +86% 
2. LE 9-83192-2 (F1-19)             2.7 (0.2)               +26% 
   LE 9-83192-2 (F1-24)             2.7 (0.1)               +26% 

Seed produced by transgenic pGSJ780A vector only lines (from a Km resistant mother plant) 
1. C24 3-107 (F1-1)                 2.2 (0.1)               +9% 
2. C24 3-109 (F1-1)                 2.3 (0.0)               +13% 
3. LE 3-83192-1 (F1-2)              2.3 (0.1)               +7% 
4. LE 3-83192-3 (F1-2)              2.4 (0.1)               +11% 
5. LE 3-9992-4 (F1-4)                                       +12% 
   LE 3-9992-4 (F1-6)               2.3 (0.0)               [illegible] 
   LE 3-9992-4 (F1-8)               2.1 (0.0) 
6. LE 3-9992-9 (F1-3)               2.3 (0.1)               +7% 


Seed produced by wild-type Arabidopsis plants 
1. C24                              2.0 (0.1) 
                                    2.3 (0.1) 
                                    2.2 
2. Landsberg-er                     1.6 (0.1) 
                                    2.1 (0.1) 
                                    2.1 
                                    2.3 (0.1) 
3. Columbia                         1.8 (0.1) 
                                    2.1 (0.1) 

1 Standard deviation values are given in parentheses. 

2 Wild-type seed values used for this comparison were chosen by ecotype and harvest date. 

Arabidopsis AP2 gene sequences were also used to negatively control seed 
mass in tobacco, a heterologous plant species. Table II shows that in five transgenic 
tobacco lines the CaMV 35S/AP2 overexpression gene construct was effective in 
reducing transgenic seed mass from -27 percent to -38 percent compared to wild-type 
seed. These results demonstrate the evolutionary conservation of AP2 gene function at 
the protein level for controlling seed mass in a heterologous system. 

Table II. Genetic control of tobacco seed mass using Arabidopsis AP2. 

Columns: line; average seed mass in mg per 50 seed¹; percent change in seed mass compared to wild-type. 
Values in each group are listed in table order. 

Seed produced by transgenic CaMV 35S/AP2 sense gene lines (from a Km resistant mother) 
  Lines: 1. SR1 9-110 T0; SR1 9-110 (F1-5); 2. SR1 9-202 (F1-G); SR1 9-202 (F1-I); 
         3. SR1 9-103 (F1-2); 4. SR1 9-413-1; 5. SR1 9-418-1 T0 
  Average seed mass: 3.1 (0.0); 3.0 (0.2); 2.8 (0.3); 3.1 (0.2); 3.2 (0.1); 3.9 (0.0); 
         2.8 (0.0); 3.0 (0.2); 3.5 (0.1) 
  Percent change: -27%; -29%; -34%; -27%; -24%; -8%; -34%; -29%; -18% 

Seed produced by transgenic CaMV 35S/AP2 antisense gene lines (from a Km resistant mother) 
  Lines: 1. SR1 15-111; SR1 15-111 (F1); 2. SR1 15-116 T0; SR1 15-116 (F1-2); 
         SR1 15-116 (F1-1); 3. SR1 15-407 (F1); 4. SR1 15-102 (F1); 5. SR1 15-413 (F1-3); 
         6. SR1 15-410 (F1-2); 7. SR1 15-210 (F1-4) 
  Average seed mass: 5.1 (0.4); 5.0 (0.4); 4.1 (0.4); 4.0 (0.1); 4.5 (0.1); 4.8 (0.5); 
         4.7 (0.3); 4.5 (0.2); 4.2 (0.0); 4.4 (0.0); 3.6 
  Percent change: +20%; +19%; -3%; -5%; +5%; +10%; +10%; +6%; +0%; +4%; -15% 

Seed produced by pGSJ780A vector only lines (from a Km resistant mother) 
  Lines: 1. SR1 3-402 (F1); 2. SR1 3-401 (F1); 3. SR1 3-405 (F1) 
  Average seed mass: 5.0 (0.1); 4.6 (0.1); 4.4 (0.1) 
  Percent change: +17%; +8%; +4% 

Seed from wild-type tobacco 
  Lines: 1. SR1 
  Average seed mass: 4.2 (0.3); 4.0 (0.1) 

1 Standard deviation values are given in parentheses. 

Use of AP2 gene constructs to control seed protein content 

Total seed protein was extracted and quantitated from seed produced by 
wild-type, ap2 mutant, transgenic AP2 antisense, and transgenic AP2 sense cosuppression 
plants according to Naito et al., Plant Mol. Biol. 11, 109-123 (1988). Seed protein was 
extracted in triplicate from batches of 100 dried seed for Arabidopsis or 50 dried seed 
for tobacco. Total protein yield was determined by the Bradford dye-binding procedure 
as described by Bradford, Anal. Biochem. 72:248 (1976). The results of this analysis are 
shown in Table III. 
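
Triplicate extractions of this kind are usually summarized as a mean with a standard deviation, which is the form used in the protein tables. A minimal sketch follows. The three yields are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

# Hypothetical protein yields from three extractions of one seed batch
# (illustrative numbers only, not data from this study).
triplicate_yields = [640.0, 652.0, 664.0]

mean_yield = statistics.mean(triplicate_yields)
# Sample standard deviation (n - 1 denominator); an assumption.
sd_yield = statistics.stdev(triplicate_yields)

print(f"{mean_yield:.0f} ({sd_yield:.0f})")
```

The printed form, mean followed by the standard deviation in parentheses, mirrors the layout of the table entries.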

ap2 mutant total seed protein content increased by 20 percent to 78 percent 
compared to wild-type control seed. Total seed protein from transgenic AP2 antisense 
plants increased by +31 percent to +97 percent compared to wild-type controls. 
Transgenic AP2 cosuppression seed showed a +13 and +17 percent increase over 
wild-type. Together, the transgenic antisense and cosuppression mutant seed consistently 
yielded more protein per seed than did the wild-type controls or transgenic plants 
containing the pGSJ780A vector only (Table III). 

Table III. Genetic control of total seed protein content in Arabidopsis using AP2. 

                                    Total seed protein      Percent change in protein content 
                                    in µg per 100 seed¹     compared to wild-type 

ap2 mutant seed 
1. ap2-1                            652 (17)                +20% relative to WT seed 
                                    615 (30)                [illegible] 
2. ap2-3                            705 (47)                +27% 
3. ap2-4                            729 (107)               [illegible] 
4. ap2-5                            617 (24)                +13% 
5. ap2-6                            836 (14)                +52% 
6. ap2-9                            798 (11)                +46% 
7. ap2-10                           836 (15)                +78% 

Transgenic CaMV 35S/AP2 antisense seed mass (from Km resistant mother) 
1. C24 14.4E (F1-1) F3 sd           615 (60)                +31% 
2. C24 15-522 (F1-1)                790 (23)                +68% 
3. C24 15-566                       925 (173)               +97% 

Transgenic CaMV 35S/AP2 sense cosuppression seed mass (from Km resistant mother plant) 
1. LE 9-83192-2 (F1-19)             616                     +13% 
   LE 9-83192-2 (F1-24)             637                     +17% 

Wild-type seed 
1. C24                              469 (19) 
2. LE                               545 (22) 
                                    555 
3. Col                              548 (42) 

1 Standard deviation values are given in parentheses. 



Transgenic tobacco containing the 35S/AP2 sense gene construct show that 
AP2 overexpression can decrease seed protein content by 27 to 45 percent compared to 
wild-type seed. Together, the transgenic Arabidopsis and tobacco results demonstrate 
that seed mass and seed protein production can be controlled by regulating AP2 gene 
activity. 

Table IV. Negative control of transgenic tobacco seed protein content by Arabidopsis AP2 gene expression.¹ 

                                    Ave. protein            Percent change in protein content 
                                    per 50 seed             compared to wild-type 

Seed produced by transgenic CaMV 35S/AP2 sense gene plant 
1. SR1 9-110                        242 (11)                -45% 
2. SR1 9-202 (F1-G)                 271 (11)                -38% 
3. SR1 9-413                        362 (8)                 -18% 
4. SR1 9-418-1                      319 (16)                -27% 

Wild-type Control 
SR1 (wild-type)                     440 (8) (10)            NA 

1 Standard deviation values are given in parentheses. 

Analysis of transgenic seed proteins by gel electrophoresis 
Arabidopsis seed produce two major classes of seed storage proteins, the 
12S cruciferins and 2S napins which are structurally related to the major storage proteins 
found in the Brassicaceae and in the Leguminosae. The composition of seed proteins in 
wild-type, ap2 mutant, and transgenic Arabidopsis seed was compared by SDS 
polyacrylamide gel electrophoresis as described by Naito et al., Plant Mol. Biol. 11, 
109-123 (1988). Total seed proteins were extracted as described above. 50 µg aliquots 
were separated by gel electrophoresis and stained using Coomassie brilliant blue. These 
results showed that the spectrum of proteins in wild-type and ap2 mutant seed are 
qualitatively indistinguishable. There is no detectable difference in the representation of 
the 12S or 2S storage proteins between the wild-type and ap2 mutant seed extracts. This 
shows that reducing AP2 gene activity genetically does not alter the profile of storage 
proteins synthesized during seed maturation. The spectrum of seed proteins produced in 
transgenic AP2 antisense and AP2 sense cosuppression seed are also indistinguishable 
from wild-type. In particular, there is no detectable difference in the representation of 
the 12S cruciferin or 2S napin storage proteins in the larger seed. 

Finally, the transgenic tobacco plants containing the 35S/AP2 
overexpression gene construct produced significantly smaller seed. Despite the decrease 
in seed mass in transgenic tobacco there was no detectable difference in storage protein 
profiles between seed from 35S/AP2 transformants and wild-type SR1. 

Example 4 

Isolation of other members of the AP2 gene family from Arabidopsis 

This example describes isolation of a number of AP2 nucleic acids from 
Arabidopsis. The nucleic acids, referred to here as RAP2 (related to AP2), were 
identified using primers specific to nucleic acid sequences from the AP2 domain 
described above. 

MATERIALS AND METHODS 

Plant Material. Arabidopsis thaliana ecotype Landsberg erecta (L-er) and 
C24 were used as wild type. Plants were grown at 22°C under a 16-hr light/8-hr dark 
photoperiod in a 1:1:1 mixture containing vermiculite/perlite/peat moss. Plants were 
watered with a one-fourth strength Peter's solution (Grace-Sierra, Milpitas, CA). Root 
tissue was harvested from plants grown hydroponically in sterile flasks containing 1x 
Murashige and Skoog plant salts (GIBCO), 1 mg/liter thiamine, 0.5 mg/liter pyridoxine, 
0.5 mg/liter nicotinic acid, 0.5 g/liter 2-(N-morpholino)ethanesulfonic acid (MES), and 
3% sucrose, with moderate shaking and 70 µmol·m⁻²·sec⁻¹ of light. 

Analysis of cloned Arabidopsis cDNAs. Arabidopsis expressed sequence 
tagged (EST) cDNA clones representing RAP2.1 and RAP2.9 were generated as 
described by Cooke et al. (Cooke, R., et al., 1996, Plant J. 9, 101-124). EST cDNA 
clones representing RAP2.2 and RAP2.8 were generated as described by Hofte et al. 
(Hofte, H., et al., 1993, Plant J. 4, 1051-1061). EST cDNA clones representing all 
other RAP2 genes were generated by Newman et al. (Newman, T., et al., 1994, Plant 
Physiol. 106, 1241-1255) and provided by the Arabidopsis Biological Resource Center 
(Ohio State University). Plasmid DNAs were isolated and purified by anion exchange 
chromatography (Qiagen, Chatsworth, CA). DNA sequences were generated using 
fluorescence dye-based nucleotide terminators and analyzed as specified by the 
manufacturer (Applied Biosystems). 

Nucleotide and Amino Acid Sequence Comparisons. The TBLASTN 
program (Altschul, S. F., et al., 1990, J. Mol. Biol. 215, 403-410) and default 
parameter settings were used to search the Arabidopsis EST database (AAtDB 4-7) for 
genes that encode AP2 domain-containing proteins. Amino acid sequence alignments 
were generated using the CLUSTAL W multiple sequence alignment program 
(Thompson, J. D., et al., 1994, Nucleic Acids Res. 22, 4673-4680). Secondary structure 
predictions were based on the principles and software programs described by Rost (Rost, 
B., 1996, Methods Enzymol. 266, 525-539) and Rost and Sander (Rost, B., et al., 1993, 
J. Mol. Biol. 232, 589-599; Rost, B., et al., 1994, Proteins 19, 55-77). 

RAP2 Gene-Specific Probes. RAP2 gene-specific fragments were 
generated by PCR using gene-specific primers and individual RAP2 plasmid DNAs as a 
template as specified by Perkin-Elmer (Roche Molecular Systems, Branchburg, NJ). The 
following primers were used to generate fragments representing each RAP2 gene: 
RAP2.1, 5'-AAGAGGACCATCTCTCAG-3', 5'-AACACTCGCTAGCTTCTC-3'; 
RAP2.2, 5'-TGGTTCAGCAGCCAACAC-3', 5'-CAATGCATAGAGCTTGAGG-3'; 
RAP2.3, 5'-TCATCGCCACGATCAACC-3', 5'-AGCAGTCCAATGCGACGG-3'; 
RAP2.4, 5'-ACGGATTTCACATCGGAG-3', 5'-CTAAGCTAGAATCGAATCC-3'; 
RAP2.7, 5'-CGATGGAGACGAAGACTC-3', 5'-GTCGGAACCGGAGTTACC-3'; 
RAP2.8, 5'-TCACTCAAAGGCCGAGATC-3', 5'-TAACAACATCACCGGCTCG-3'; 
RAP2.9, 5'-GTGAAGGCTTAGGAGGAG-3', 5'-TGCCTCATATGAGTCAGAG-3'. 
PCR-synthesized DNA fragments were gel purified and radioactively labeled using 
random oligonucleotides (Amersham) for use as probes in gene mapping and RNA gel 
blot experiments. 
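
Basic properties of the primers listed above, such as length, GC content, and reverse complement, can be checked with a short script. The sketch below uses the RAP2.1 primer pair from the text. The helper names are illustrative, and nothing here is part of the described method.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# RAP2.1 gene-specific primer pair as listed in the text (5' to 3').
RAP2_1_PRIMERS = ("AAGAGGACCATCTCTCAG", "AACACTCGCTAGCTTCTC")

for primer in RAP2_1_PRIMERS:
    print(primer, len(primer), f"{gc_fraction(primer):.2f}", reverse_complement(primer))
```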

Gene Mapping Experiments. RAP2 genes were placed on the 
Arabidopsis genetic map by either restriction fragment length polymorphism segregation 
analysis using recombinant inbred lines as described by Reiter et al. (Reiter, R. S., et 
al., 1992, Proc. Natl. Acad. Sci. USA 89, 1477-1481) or by matrix-based analysis of 
pooled DNAs from the Arabidopsis yUP or CIC yeast artificial chromosome (YAC) 
genomic libraries (Ecker, J. R., 1990, Methods 1, 186-194; Creusot, F., et al., 1995, 
Plant J. 8, 763-770) using the PCR (Green, E. D., et al., 1990, Proc. Natl. Acad. Sci. 
USA 87, 1213-1217; Kwiatkowski, T. J., et al., 1990, Nucleic Acids Res. 18, 
7191-7192). Matrix based mapping results were confirmed by PCR using DNA from 
individual YAC clones. 

mRNA Isolation. Polysomal poly(A) mRNAs from Arabidopsis flower, 
rosette leaf, inflorescence stem internode, and hydroponically-grown roots were isolated 
according to Cox and Goldberg (Cox, K. H., et al., 1988, in Plant Molecular Biology: A 
Practical Approach, ed. Shaw, C. H. (IRL, Oxford), pp. 1-35). 

RNA Gel Blot Studies. RNA gel blot hybridizations were carried out as 
specified by the manufacturer (Amersham). mRNA sizes were estimated relative to 
known RNA standards (BRL). AP2 transcripts were detected using a labeled DNA 
fragment representing nucleotides 1-1371 of the AP2 cDNA plasmid clone pAP2c1 
(Jofuku, K. D., et al., 1994, Plant Cell 6, 1211-1225). 

RESULTS 

The AP2 Domain Defines a Large Family of Plant Proteins. Using the 
AP2 domain as a sequence probe, 34 cDNA clones were identified that encode putative 
RAP2 proteins in the Arabidopsis EST database (Materials and Methods). Several of 
these partial sequences have been reported previously (Ohme-Takagi, et al., 1995, Plant 
Cell 7, 173-182; Elliot, R. C., et al., 1996, Plant Cell 8, 155-168; Klucher, K. M., et 
al., 1996, Plant Cell 8, 137-153; Wilson, K., et al., 1996, Plant Cell 8, 659-671; Ecker, 
J. R., 1995, Science 268, 667-675; Weigel, D., 1995, Plant Cell 7, 388-389). Based on 
nucleotide sequence comparison, it was inferred that approximately half of the 34 RAP2 
cDNA sequences were likely to represent redundant clones. Therefore, the 17 putative 
RAP2 cDNA clones that appeared to represent unique genes and that contained the 
largest cDNA inserts were selected, and a complete DNA sequence was generated for 
each. It was determined 
from the predicted amino acid sequences of these clones that the Arabidopsis RAP2 ESTs 
represent a minimum of 12 genes that are designated RAP2.1-RAP2.12. As shown in 
Table V, preliminary gene mapping experiments using restriction fragment length 
polymorphism analysis and PCR-based screening of the Arabidopsis yUP and CIC yeast 
artificial chromosome libraries (Materials and Methods) revealed that at least 7 members 
of the RAP2 gene family are distributed over 4 different chromosomes. In addition, 
several family members are tightly linked in the genome. For example, RAP2.10 is only 
10 kb away from AP2, which is also closely linked to ANT on chromosome 4 (Elliot, R. 
C., et al., 1996, Plant Cell 8, 155-168; Klucher, K. M., et al., 1996, Plant Cell 8, 
137-153). 

Sequence analysis also revealed that the proteins encoded by the RAP2 
genes are all characterized by the presence of at least one AP2 domain. Fig. 4 shows a 
sequence comparison of 21 AP2 domains from 19 different polypeptides including 
RAP2.1-RAP2.12, AP2, ANT, TINY, and the tobacco EREBPs. From this comparison, 
it was determined that there are 2 conserved sequence blocks within each AP2 domain. 
The first block, referred to as the YRG element, consists of 19-22 amino acids, is highly 
basic and contains the conserved YRG amino acid motif (Fig. 4 A and B). The second 
block, referred to as the RAYD element, is 42-43 amino acids in length and contains a 
highly conserved 18-amino acid core region that is predicted to form an amphipathic 
α-helix in the AP2 domains of AP2, ANT, TINY, and the EREBPs. In addition, there 
are several invariant amino acid residues within the YRG and RAYD elements that may 
also play a critical role in the structure or function of these proteins. For example, the 
glycine residue at position 40 within the RAYD element is invariant in all AP2 domain 
containing proteins (Fig. 4 A and B) and has been shown to be important for AP2 
function (Jofuku, K. D., et al., 1994, Plant Cell 6, 1211-1225). 



Table V. Arabidopsis RAP2 

                  RAP2 gene containing                            Chromosome 
Gene              YAC clones*                                     map position† 

AINTEGUMENTA      ND                                              4-73 
TINY              ND                                              5-32 to 5-45 
RAP2.1            yUP18H2, CIC11D10                               ND‡ 
RAP2.2            yUP6C1                                          38« 
RAP2.3            yUP12G6, yUP24B8, yUP23E11, CIC4H5, CIC12C2     3-21 
RAP2.4            CIC7D2, CIC10C4                                 ND‡ 
RAP2.7            yUP10E1                                         ND‡ 
RAP2.8            CIC10G7                                         1-94 to 1-103§ 
RAP2.9            CIC9E12                                         1-117‡ 
RAP2.10           ND                                              4-73 

* YAC clones were determined to contain the specified RAP2 gene by PCR-based DNA synthesis 
using gene-specific primers (Green, E. D., et al., 1990, Proc. Natl. Acad. Sci. USA 87, 1213-1217; Kwiatkowski, T. 
J., et al., 1990, Nucleic Acids Res. 18, 7191-7192). 
† Chromosome map positions are given with reference to the Arabidopsis unified genetic map (AAtDB 
4-7). 
‡ YAC-based map position is ambiguous. 
§ Preliminary map position is based on a single contact with the physical map. 

GenBank accession numbers for complete EST sequences for RAP2 and other 
genes are as follows: AINTEGUMENTA (U40256/U41339), TINY (X94598), RAP2.1 
(AF003094), RAP2.2 (AF003095), RAP2.3 (AF003096), RAP2.4 (AF003097), RAP2.5 
(AF003098), RAP2.6 (AF003099), RAP2.7 (AF003100), RAP2.8 (AF003101), RAP2.9 
(AF003102), RAP2.10 (AF003103), RAP2.11 (AF003104), and RAP2.12 (AF003105). 
All RAP2 cDNA clones were originally reported with partial sequences and given 
GenBank accession numbers as shown in parentheses following each gene name: RAP2.1 
(Z27045), RAP2.2 (Z26440), RAP2.3 (T04320 and T13104), RAP2.4 (T13774), RAP2.5 
(T45365), RAP2.6 (T45770), RAP2.7 (T20443), RAP2.8 (Z33865), RAP2.9 (Z37270), 
RAP2.10 (T76017), RAP2.11 (T42962), and RAP2.12 (T42544). Due to the preliminary 
nature of the EST sequence data, the predicted amino acid sequences for EST Z27045, 
T04320, T13774, and T42544 contained several errors and were incorrectly reported 
(Ohme-Takagi, et al., 1995, Plant Cell 7, 173-182; Klucher, K. M., et al., 1996, Plant 
Cell 8, 137-153; Wilson, K., et al., 1996, Plant Cell 8, 659-671; Ecker, J. R., 1995, 
Science 268, 667-675; Weigel, D., 1995, Plant Cell 7, 388-389). They are correctly 
given in the GenBank accession numbers noted above. 

RAP2 cDNA sequence comparison also shows that there are at least two
branches to the RAP2 gene family tree. The AP2-like and EREBP-like branches are
distinguished by the number of AP2 domains contained within each polypeptide and by
sequences within the conserved YRG element. The AP2-like branch of the RAP2 gene
family comprises three genes, AP2, ANT, and RAP2.7, each of which encodes a
protein containing two AP2 domains (Fig. 4A). In addition, these proteins possess a
conserved WEAR/WESH amino acid sequence motif located in the YRG element of both
AP2 domain repeats (Fig. 4A). By contrast, genes belonging to the EREBP-like branch
of the RAP2 gene family encode proteins with only one AP2 domain and include
RAP2.1-RAP2.6, RAP2.8-RAP2.12, and TINY (Fig. 4B). Proteins in this class possess a
conserved 7-amino acid sequence motif referred to as the WAAEIRD box (Fig. 4B) in
place of the WEAR/WESH motif located in the YRG element (Fig. 4A). Based on these
comparisons, separate AP2 domain consensus sequences for both classes of RAP2
proteins were generated (Fig. 4 A and B). These results suggest that the AP2 domain
and specific sequence elements within the AP2 domain are important for RAP2 protein
functions.

The AP2-like class of RAP2 proteins is also characterized by the presence of
a highly conserved 25-26 amino acid linker region that lies between the two AP2 domain
repeats (Klucher, K. M., et al., 1996, Plant Cell 8, 137-153). This region is 40%
identical and 48% similar between AP2, ANT and RAP2.7 and is not found in proteins
belonging to the EREBP-like branch of RAP2 proteins. Molecular analysis of the ant-3
mutant allele showed that the invariant C-terminal glycine residue within this linker
region is essential for ANT function in vivo (Klucher, K. M., et al., 1996, Plant Cell 8,
137-153), suggesting that the linker region may also play an important role in AP2 and
RAP2.7 function.
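
The percent identity and percent similarity figures quoted for the linker region can be illustrated with a minimal sketch. It assumes two already aligned peptides of equal length; the example peptides and the conservative-substitution groups are hypothetical choices made for illustration and are not the sequences or the scoring scheme used in the analysis described above.

```python
# Illustrative only: the residue groupings below are an assumed definition of
# "similar" amino acids, not the scheme used for the AP2/ANT/RAP2.7 comparison.
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("NQ"),    # amide
    set("ST"),    # hydroxyl
    set("AG"),    # small
]


def _check_aligned(a: str, b: str) -> None:
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal aligned length")


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions carrying the same residue."""
    _check_aligned(a, b)
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def percent_similarity(a: str, b: str) -> float:
    """Percentage of aligned positions that are identical or in the same group."""
    _check_aligned(a, b)

    def similar(x: str, y: str) -> bool:
        return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)

    hits = sum(similar(x, y) for x, y in zip(a, b))
    return 100.0 * hits / len(a)


if __name__ == "__main__":
    # Hypothetical aligned peptides, not taken from any RAP2 protein.
    pep1, pep2 = "MKVLDERST", "MRVIDQRSA"
    print(f"identity:   {percent_identity(pep1, pep2):.1f}%")
    print(f"similarity: {percent_similarity(pep1, pep2):.1f}%")
```

Similarity is always at least as large as identity under this definition, which matches the ordering of the two values reported for the linker region.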

Sequences Within the RAYD Element are Predicted to Form
Amphipathic α-Helices. As noted above, the 18-amino acid core region within the
RAYD element of the AP2 domain in AP2 is predicted to form an amphipathic α-helix
that may be important for AP2 structure or function. Secondary structure prediction
analysis was used to determine whether this structure has been conserved in RAP2
proteins. As shown in Fig. 4, the core region represents the most highly conserved
sequence block in the RAYD element of AP2 and the RAP2 proteins. Secondary
structure analysis predicts that all RAP2 proteins contain sequences within the RAYD
element that form amphipathic α-helices (Fig. 4 A and B). Fig. 4C
shows that sequences in RAP2.7-R1 are predicted to form an amphipathic α-helix that is
100% identical to that predicted for AP2-R1 and 63% similar to that predicted for
ANT-R1. Sequences within the AP2 domain of EREBP-like RAP2 proteins are predicted
to form similar α-helical structures. Fig. 4D shows that the RAP2.2, RAP2.5, and
RAP2.12 α-helices are 81, 100, and 81% similar to that predicted for EREBP-3,
respectively. Together, these results strongly suggest that the predicted amphipathic
α-helix in the RAYD element is a conserved structural motif that is important for AP2
domain function in all RAP2 proteins.
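
One common way to quantify how amphipathic a candidate helix is uses the helical hydrophobic moment. The sketch below is not the secondary structure prediction method used above, which is not specified here. It assumes an ideal α-helix with 100 degrees of rotation per residue and the Kyte-Doolittle hydropathy scale, and the two 18-residue test peptides are hypothetical, not RAYD element sequences.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this illustration).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq: str, angle_deg: float = 100.0) -> float:
    """Mean hydrophobic moment per residue for a peptide modeled as a helix.

    Each residue contributes a vector whose length is its hydropathy and whose
    direction advances by `angle_deg` per residue around the helix axis.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    delta = math.radians(angle_deg)
    x = sum(KYTE_DOOLITTLE[r] * math.cos(i * delta) for i, r in enumerate(seq))
    y = sum(KYTE_DOOLITTLE[r] * math.sin(i * delta) for i, r in enumerate(seq))
    return math.hypot(x, y) / len(seq)


if __name__ == "__main__":
    # Hypothetical peptides: a heptad-like pattern that places leucines on one
    # helical face, and the same composition with the residues segregated.
    amphipathic = "LKKLLKKLKKLLKKLKKL"
    segregated = "LLLLLLLLLKKKKKKKKK"
    print("amphipathic:", round(hydrophobic_moment(amphipathic), 2))
    print("segregated: ", round(hydrophobic_moment(segregated), 2))
```

A peptide whose hydrophobic residues recur every three to four positions gives a large moment, whereas the same residues arranged in two blocks give a small one, which is the property the term "amphipathic α-helix" describes.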

RAP2 Genes are Expressed in Floral and Vegetative Tissues. Previous
studies have shown that AP2 and ANT are differentially expressed at the RNA level
during plant development (Jofuku, K. D., et al., 1994, Plant Cell 6, 1211-1225; Elliot,
R. C., et al., 1996, Plant Cell 8, 155-168; Klucher, K. M., et al., 1996, Plant Cell 8,
137-153). AP2 is expressed at different levels in developing flowers, leaves,
inflorescence stems, and roots. To determine where in plant development the
EREBP-like class of RAP2 genes are expressed, RAP2.1, RAP2.2, RAP2.3, and RAP2.4
gene-specific probes were reacted with an mRNA gel blot containing flower, leaf,
inflorescence stem, and root polysomal poly(A) mRNA. Results from these experiments
showed that each RAP2 gene produces a uniquely sized mRNA transcript and displays a
distinct pattern of gene expression in flowers, leaves, inflorescence stems, and roots.
For example, the RAP2.1 gene is expressed at low levels in wild-type flower, leaf, stem,
and root. RAP2.2 gene expression appears to be constitutive in that RAP2.2 transcripts
are detected at similar levels in wild-type flower, leaf, stem, and root. By contrast, the
RAP2.3 gene is expressed at a low level in wild-type flowers, at a slightly higher level in
leaves, and is relatively highly expressed in both stems and roots. Finally, the RAP2.4
gene is also expressed in wild-type flower, leaf, stem, and root and is most highly
expressed in roots and leaves. These data indicate that individual members of the
EREBP-like family of RAP2 genes are expressed at the mRNA level in both floral and
vegetative tissues and show quantitatively different patterns of gene regulation.

RAP2 Gene Expression Patterns are Affected by ap2. RAP2 gene 
expression was analyzed in ap2-10 mutant plants by RNA gel blot analysis to determine 
whether AP2 is required for RAP2 gene expression. The expression of three RAP2
genes is differentially affected by the loss of AP2 function. For example, RAP2.2 gene
expression is not dramatically altered in mutant flowers, leaves, and roots compared to 
wild-type Landsberg erecta but is down-regulated in mutant stem. RAP2.3 gene 
expression appears unchanged in mutant roots but is up-regulated in mutant flowers and 
leaves and down-regulated in mutant stems. By contrast, RAP2.4 gene expression 
appears relatively unchanged in mutant stems and roots but is slightly up-regulated in 
mutant flowers and leaves. To control for possible secondary effects of ecotype on 
RAP2 gene expression, RAP2 gene expression levels in wild-type C24 and ap2-10 
mutant stems were compared. These results show that the differences in RAP2.2,
RAP2.3, and RAP2.4 gene expression in C24 and ap2-10 stem are similar to those
observed between wild-type Landsberg erecta and ap2-10 mutant stem. Together these 
results suggest that AP2 directly or indirectly regulates the expression of at least three 
RAP2 genes. More importantly, these results suggest that AP2 is controlling gene 
expression during both reproductive and vegetative development. 

DISCUSSION 

RAP2 Genes Encode a New Family of Putative DNA Binding Proteins. 
One important conclusion from the characterization of these clones is that the AP2 
domain has been evolutionarily conserved in at least Arabidopsis and tobacco. In 
addition, there are two subfamilies of AP2 domain containing proteins in Arabidopsis that 
are designated as the AP2-like and the EREBP-like classes of RAP2 proteins. In vitro
studies have shown that both the EREBP and the AP2 proteins bind to DNA in a
sequence-specific manner and that the AP2 domain is sufficient to confer EREBP DNA
binding activity (Ohme-Takagi, et al., 1995, Plant Cell 7, 173-182). From these results
and the high degree of sequence similarity between the AP2 domain motifs in AP2, the
EREBPs, and the RAP2 proteins, it is concluded that RAP2 proteins function as plant
sequence-specific DNA binding proteins. Although the exact amino acid residues within
the AP2 domain required for DNA binding have not yet been identified, sequence
comparisons have revealed two highly conserved motifs referred to as the YRG and
RAYD elements within the AP2 domain.

The RAYD element is found in all known AP2 domains and contains a
conserved core region that is predicted to form an amphipathic α-helix (Fig. 4). One
hypothesis for the function of this α-helical structure is that it is involved in DNA
binding, perhaps through the interaction of its hydrophobic face with the major groove of
DNA (Zubay, G., et al., 1959, J. Mol. Biol. 1, 1-20). Alternatively, this structure may
mediate protein-protein interactions important for RAP2 functions. These interactions
may involve the ability to form homo- or heterodimers similar to that observed for the
MADS box family of plant regulatory proteins (Huang, H., et al., 1996, Plant Cell 8,
81-94; Riechmann, J. L., et al., 1996, Proc. Natl. Acad. Sci. USA 93, 4793-4798) and
for the mammalian ATF/CREB family of transcription factors (Hai, T., et al., 1991,
Proc. Natl. Acad. Sci. USA 88, 3720-3724; O'Shea, E. K., et al., 1992, Cell 68,
699-708).

The conserved YRG element may also function in DNA binding due to the
highly basic nature of this region in all RAP2 proteins (Fig. 4). However, the YRG
element also contains sequences that are specific for each class of RAP2 protein and may
be functionally important for DNA binding. Specifically, the WAAEIRD motif is highly
conserved in tobacco EREBPs and in EREBP-like RAP2 proteins. By contrast, the
WEAR/WESH motif replaces the WAAEIRD box in AP2-like RAP2 proteins (Fig. 4). In
vitro studies suggest that the EREBPs and AP2 recognize distinct DNA sequence
elements (Ohme-Takagi, et al., 1995, Plant Cell 7, 173-182). It is possible that the
WAAEIRD and WEAR/WESH motifs may be responsible for DNA binding sequence
specificity. The presence of two AP2 domains in AP2 may also contribute to differences
in sequence specificity. Although the molecular significance of having one or two AP2
domain motifs is not yet known, genetic and molecular studies have shown that mutations
in either AP2 domain affect AP2 function, implying that both are required for
wild-type AP2 activity (Jofuku, K. D., et al., 1994, Plant Cell 6, 1211-1225).

In addition to Arabidopsis and tobacco, cDNAs that encode diverse AP2
domain-containing proteins have been found in maize, rice, castor bean, and several
members of the Brassicaceae including canola (Ohme-Takagi, et al., 1995, Plant Cell 7,
173-182; Elliot, R. C., et al., 1996, Plant Cell 8, 155-168; Klucher, K. M., et al., 1996,
Plant Cell 8, 137-153; Wilson, K., et al., 1996, Plant Cell 8, 659-671; and Weigel, D.,
1995, Plant Cell 7, 388-389). This strongly suggests that the AP2 domain is an
important and evolutionarily conserved element necessary for the structure or function of
these proteins.

RAP2 Gene Expression in Floral and Vegetative Tissues. The AP2,
RAP2.1, RAP2.2, RAP2.3, and RAP2.4 genes show overlapping patterns of gene
expression at the mRNA level in flowers, leaves, inflorescence stems, and roots. However,
each gene appears to be differentially regulated in terms of its mRNA prevalence.
The overlap in RAP2 gene activity could affect the genetic analysis of AP2 and RAP2
gene functions if these genes are also functionally redundant. For example, in flower
development AP2 and ANT show partially overlapping patterns of gene expression at the
organ and tissue levels (Jofuku, K. D., et al., 1994, Plant Cell 6, 1211-1225; Elliot, R.
C., et al., 1996, Plant Cell 8, 155-168; Klucher, K. M., et al., 1996, Plant Cell 8,
137-153; W. Szeto). From single and double mutant analysis it has also been suggested
that AP2 may be partially redundant in function with ANT (Elliot, R. C., et al., 1996,
Plant Cell 8, 155-168). The phenomenon of genetic redundancy and its ability to mask
the effects of gene mutation is more clearly demonstrated by the MADS domain
containing floral regulatory genes APETALA1 (AP1) and CAULIFLOWER (CAL).
Genetic studies have demonstrated that mutations in cal show no visible floral phenotype
except when in double mutant combination with ap1 (Bowman, J. L., et al., 1993,
Development (Cambridge, U.K.) 119, 721-743), indicating that AP1 is completely
redundant in function for CAL. The hypothesis that the RAP2 genes may have
genetically redundant functions is supported by the fact that the dominant gain-of-function
mutation tiny is the only Arabidopsis RAP2 EREBP-like gene mutant isolated to date
(Wilson, K., et al., 1996, Plant Cell 8, 659-671).

AP2 Activity Is Detectable in Vegetative Development. The present
analysis of RAP2 gene expression in wild-type and ap2-10 plants suggests that AP2
contributes to the regulation of RAP2 gene activity throughout Arabidopsis development.
RAP2 gene expression is both positively and negatively affected by the absence of AP2
activity during development. The observed differences in RAP2.2, RAP2.3, and RAP2.4
gene expression levels in wild-type and ap2-10 flowers and vegetative tissues are not
apparently due to differences in ecotype because similar changes in gene expression
levels were observed for all three RAP2 genes in stems when ecotype was controlled.
The regulation of RAP2 gene expression by AP2 in stems clearly indicates that, unlike
other floral homeotic genes, AP2 functions in both reproductive and vegetative
development.

Example 5

This example shows that transgenic plants of the invention bear seed with
altered fatty acid content and composition.

Antisense transgenic plants were prepared using AP2, RAP2.8, and RAP2.1
(two independent plants) using methods described above. The fatty acid content and
composition were determined using gas chromatography as described by Broun and
Somerville, Plant Physiol. 113:933-942 (1997). The results are shown in Table VI (for
AP2) and Table VII (for the RAP2 genes). As can be seen there, the transgenic plants of
the invention have increased fatty acid content as compared to wild-type plants. In
addition, the profile of fatty acids is altered in the plants.

The results shown in Table VIII reveal that there is approximately 7 µg
of oil in a wild-type Arabidopsis seed. By contrast, there is approximately 9 µg in an
ap2-4 or ap2-5 seed and approximately 14-15 µg in an ap2-10 seed. In addition, the
spectra of fatty acids in wild-type and ap2 mutant seeds are quantitatively
indistinguishable. Thus, loss of AP2 activity increases total fatty acid content without
detectable changes in fatty acid composition.

Example 6

This example describes construction of promoter constructs that are used to
prepare expression cassettes useful in making transgenic plants of the invention. In
particular, this example shows use of two preferred promoters, the promoter from the
AP2 gene and the promoter from the BEL1 gene.

Figure 5 shows an AP2 promoter construct. pAP2 represents the 16.3 kb AP2
promoter vector cassette that is used to generate chimeric genes for use in plant
transformations described here. pAP2 comprises the 4.0 kb promoter region of the
Arabidopsis AP2 gene. The Ti plasmid vector used is the pDE1000 vector (Plant Genetic
Systems, Ghent, Belgium). The pDE1000 vector DNA was linearized with BamHI and
the AP2 promoter region inserted as a 4.0 kb BamHI DNA fragment from plasmid
subclone pLE7.2. At the 3' end of the inserted AP2 promoter region, designated AP2,
lie three restriction sites (EcoRI, SmaI and SnaBI) into which different gene coding
regions can be inserted to generate chimeric AP2 promoter/gene cassettes. NOS::NPTII
represents the plant selectable marker gene NPTII under the direction of the nopaline
synthase promoter, which confers resistance to the antibiotic kanamycin on transformed
plant cells carrying an integrated AP2 promoter cassette. LB and RB represent the
T-DNA left and right border sequences, respectively, that are required for transfer of
T-DNA containing the AP2 promoter cassette into the plant genome. PVS1 designates
the bacterial DNA sequences that function as a bacterial origin of replication in both E.
coli and Agrobacterium tumefaciens, thus allowing pAP2 plasmid replication and
retention in both bacteria. Amp R and Sm/Sp R designate bacterial selectable marker
genes that confer resistance to the antibiotics ampicillin and streptomycin/spectinomycin,
respectively, and allow for selection of Agrobacterium strains that carry the pAP2
recombinant plasmid.

Figure 6 shows a BEL1 promoter construct. pBEL1 represents the 16.8 kb
BEL1 promoter vector cassette that is used to generate chimeric genes for use in plant
transformations described here. pBEL1 comprises the 4.5 kb promoter region of
the Arabidopsis BEL1 gene. The Ti plasmid vector used is the pDE1000 vector (Plant
Genetic Systems, Ghent, Belgium). The pDE1000 vector DNA was linearized with
BamHI and the BEL1 promoter region inserted as a 4.5 kb BamHI-BglII DNA fragment
from plasmid subclone p\lC9R (L. Reiser, unpublished). At the 3' end of the inserted
BEL1 promoter region, designated BEL1, lie three restriction sites (EcoRI, SmaI and
SnaBI) into which different gene coding regions can be inserted to generate chimeric
BEL1 promoter/gene cassettes. NOS::NPTII represents the plant selectable marker gene
NPTII under the direction of the nopaline synthase promoter, which confers resistance to
the antibiotic kanamycin on transformed plant cells carrying an integrated BEL1 promoter
cassette. LB and RB represent the T-DNA left and right border sequences, respectively,
that are required for transfer of T-DNA containing the BEL1 promoter cassette into the
plant genome. PVS1 designates the bacterial DNA sequences that function as a bacterial
origin of replication in both E. coli and Agrobacterium tumefaciens, thus allowing pBEL1
plasmid replication and retention in both bacteria. Amp R and Sm/Sp R designate bacterial
selectable marker genes that confer resistance to the antibiotics ampicillin and
streptomycin/spectinomycin, respectively, and allow for selection of Agrobacterium
strains that carry the pBEL1 recombinant plasmid.
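
The construct sizes quoted in this example can be cross-checked with simple arithmetic, and the BamHI-BglII cloning step relies on the two enzymes leaving matching cohesive ends. The sketch below illustrates both points. The backbone size it derives is only an implication of the stated totals (it is not given in the text), and the function and dictionary names are hypothetical.

```python
# Sizes as stated in the text: total construct and inserted promoter fragment (kb).
CONSTRUCTS = {
    "pAP2": {"total_kb": 16.3, "promoter_kb": 4.0},
    "pBEL1": {"total_kb": 16.8, "promoter_kb": 4.5},
}

# Recognition sequences; both enzymes cut after the first base of the site
# (G^GATCC for BamHI, A^GATCT for BglII).
RECOGNITION_SITES = {"BamHI": "GGATCC", "BglII": "AGATCT"}


def implied_backbone_kb(total_kb: float, promoter_kb: float) -> float:
    """Vector size implied by subtracting the promoter insert from the total."""
    return round(total_kb - promoter_kb, 1)


def five_prime_overhang(site: str, cut_offset: int = 1) -> str:
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    return site[cut_offset:len(site) - cut_offset]


if __name__ == "__main__":
    for name, sizes in CONSTRUCTS.items():
        print(name, "implied backbone (kb):", implied_backbone_kb(**sizes))
    for enzyme, site in RECOGNITION_SITES.items():
        print(enzyme, "overhang:", five_prime_overhang(site))
```

Both constructs imply the same vector backbone size, as expected when the same vector is used for each, and the identical GATC overhangs explain why a BamHI-BglII fragment can be ligated into a BamHI-linearized vector.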

Example 7 

This example shows that transgenic plants of the invention have increased
seed yield.

It is widely known that seed filling and the deposition of total seed contents are
determined in part by the availability and supply of carbon- and nitrogen-containing
compounds or assimilates to the developing seed. Thus, an increase in seed size and
seed contents typically results from a decrease in total seed number in the presence of a
fixed supply of photoassimilates. Since total seed number is dependent on many factors
including both male and female fertility and since ap2 mutations affect both ovule and
stamen development, total seed number and total seed yield were measured in transgenic
plants of the invention and in wild-type plants to determine whether the increase in seed
size occurs at the expense of total seed number or seed yield.

To test this hypothesis directly, total seed yield for individual ap2-10 (+/-)
plants was determined. As shown in Table IX, total seed yield per ap2-10 (+/-) plant is
increased by 35% when compared to wild-type C24, due in part to an increase in average
seed mass (Table IX). In addition, increases in total seed yield in ap2-10 transgenic
plants may result from an increase in the total number of flowers produced. Table X
shows that, on average, ap2-10 (-/-) plants produce at least 80% more flowers than
wild-type. Thus genetically manipulating AP2 activity in transgenic plants allows for
agriculturally desirable increases in total seed yield by increasing seed mass, seed
contents, and number of flowers produced.

Table IX. Genetic control of Arabidopsis seed yield by AP2.

                   Average seed mass     Total seed yield    Percent change in yield
                   (mg per 100 seed)     (g)¹                compared to wild-type
1. ap2-10 (+/-)    2.9                   2.61 (0.39)         +35%
2. C24             2.1                   1.94 (0.27)

¹ n = 10. Standard deviation values are given in parentheses.



Table X. Genetic control of Arabidopsis flower number by AP2.

                   Total # of inflorescences    Average number of flowers on
                   (per plant)¹                 primary inflorescence (per plant)¹
1. ap2-10 (-/-)    14.7 (3.9)                   78.7 (6.0)
2. C24             11.0 (3.6)                   42.9 (7.6)

¹ n = 7. Standard deviation values are given in parentheses.
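
The percentage figures quoted in this example follow from the table means. As a minimal sketch, the values below are copied from Tables IX and X, while the function and variable names are hypothetical; the calculation uses the means only and ignores the reported standard deviations.

```python
def percent_change(mutant: float, wild_type: float) -> float:
    """Relative change of the mutant mean versus the wild-type mean, in percent."""
    return 100.0 * (mutant - wild_type) / wild_type


# (ap2-10 mean, C24 wild-type mean), as listed in Tables IX and X.
MEASUREMENTS = {
    "average seed mass (mg per 100 seed)": (2.9, 2.1),
    "total seed yield (g)": (2.61, 1.94),
    "inflorescences per plant": (14.7, 11.0),
    "flowers on primary inflorescence": (78.7, 42.9),
}

if __name__ == "__main__":
    for label, (mutant, wild_type) in MEASUREMENTS.items():
        print(f"{label}: {percent_change(mutant, wild_type):+.1f}%")
```

The total seed yield means give a change that rounds to the +35% listed in Table IX, and the flower counts on the primary inflorescence give a change above 80%, consistent with the statement in the text.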



The above examples are provided to illustrate the invention but not to limit
its scope. Other variants of the invention will be readily apparent to one of ordinary skill
in the art and are encompassed by the appended claims. All publications, patents, and
patent applications cited herein are hereby incorporated by reference for all purposes.

1. A method of modulating seed mass in a plant, the method comprising:
providing a first plant comprising a recombinant expression cassette
containing an ADC nucleic acid linked to a plant promoter;
selfing the first plant or crossing the first plant with a second plant,
thereby producing a plurality of seeds; and
selecting seed with altered mass.

2. The method of claim 1, wherein expression of the ADC nucleic
acid inhibits expression of an endogenous ADC gene and the step of selecting includes
the step of selecting seed with increased mass.

3. The method of claim 2, wherein the seed have increased protein
content, carbohydrate content, or oil content.

4. The method of claim 2, wherein the ADC nucleic acid is linked to
the plant promoter in the antisense orientation.

5. The method of claim 2, wherein the ADC nucleic acid hybridizes
under stringent conditions to a nucleic acid having a sequence as set forth in Genbank
Accession Nos. U12546, AF003101, and AF003094.

6. The method of claim 2, wherein the ADC nucleic acid hybridizes
under stringent conditions to a nucleic acid having a sequence as set forth in SEQ ID
NO:1.

7. The method of claim 2, wherein the ADC nucleic acid hybridizes
under stringent conditions to a nucleic acid having a sequence as set forth in SEQ ID
NO:2.

8. The method of claim 2, wherein the ADC nucleic acid is selected
from a group consisting of Genbank accession numbers U12546, AF003094,
AF003095, AF003096, AF003097, AF003098, AF003099, AF003100, AF003101,
AF003102, AF003103, AF003104, and AF003105.

9. The method of claim 2, wherein the first and second plants are
the same species.

10. The method of claim 2, wherein the first and second plants are
members of the family Brassicaceae.

11. The method of claim 2, wherein the first and second plants are
members of the family Solanaceae.

12. The method of claim 2, wherein the plant promoter is a
constitutive promoter.

13. The method of claim 12, wherein the promoter is a CaMV 35S
promoter.

14. The method of claim 2, wherein the promoter is a tissue-specific
promoter.

15. The method of claim 14, wherein the promoter is ovule-specific.

16. A seed produced by the method of claim 2.

17. The method of claim 1, wherein expression of the ADC nucleic
acid enhances expression of an endogenous ADC gene and the step of selecting
includes the step of selecting seed with decreased mass.

18. The method of claim 17, wherein the ADC nucleic acid
hybridizes under stringent conditions to a nucleic acid having a sequence as set forth in
Genbank Accession Nos. U12546, AF003101, and AF003094.

19. The method of claim 17, wherein the ADC nucleic acid
hybridizes under stringent conditions to a nucleic acid having a sequence as set forth in
SEQ ID NO:1.

20. The method of claim 17, wherein the ADC nucleic acid
hybridizes under stringent conditions to a nucleic acid having a sequence as set forth in
SEQ ID NO:2.

21. The method of claim 17, wherein the ADC nucleic acid is
selected from a group consisting of Genbank accession numbers U12546, AF003094,
AF003095, AF003096, AF003097, AF003098, AF003099, AF003100, AF003101,
AF003102, AF003103, AF003104, and AF003105.

22. The method of claim 17, wherein the first and second plants are
the same species.

23. The method of claim 17, wherein the first and second plants are
members of the family Brassicaceae.

24. The method of claim 17, wherein the first and second plants are
members of the family Solanaceae.

25. The method of claim 17, wherein the plant promoter is a
constitutive promoter.

26. The method of claim 25, wherein the promoter is a CaMV 35S
promoter.



27. The method of claim 17, wherein the promoter is a tissue-specific
promoter.

28. The method of claim 27, wherein the promoter is ovule-specific.

29. A seed produced by the method of claim 17.

30. A seed comprising a recombinant expression cassette containing
an ADC nucleic acid.

31. The seed of claim 30, which is derived from a plant that is a
member of the family Brassicaceae.

32. The seed of claim 30, wherein the ADC nucleic acid hybridizes
under stringent conditions to a nucleic acid having a sequence as set forth in Genbank
Accession Nos. U12546, AF003101, and AF003094.

33. The seed of claim 30, wherein the ADC nucleic acid hybridizes
under stringent conditions to a nucleic acid having a sequence as set forth in SEQ ID
NO:1.

34. The seed of claim 30, wherein the ADC nucleic acid hybridizes
under stringent conditions to a nucleic acid having a sequence as set forth in SEQ ID
NO:2.

35. The seed of claim 30, wherein the ADC nucleic acid is selected
from a group consisting of Genbank accession numbers U12546, AF003094,
AF003095, AF003096, AF003097, AF003098, AF003099, AF003100, AF003101,
AF003102, AF003103, AF003104, and AF003105.

36. The seed of claim 30, wherein the ADC nucleic acid is linked to
a plant promoter in an antisense orientation and the seed mass is at least about 10%
greater than the average mass of seeds from the same plant variety which lack the
recombinant expression cassette.

37. The seed of claim 36, wherein the mass is at least about 20%
greater than the average mass of seeds from the same plant variety which lack the
recombinant expression cassette.

38. The seed of claim 36, wherein the mass is at least about 50%
greater than the average mass of seeds from the same plant variety which lack the
recombinant expression cassette.

39. The seed of claim 36, wherein the oil content is proportionally
increased.

40. The seed of claim 36, wherein the protein content is
proportionally increased.

41. The seed of claim 30, wherein the ADC nucleic acid is linked to
a plant promoter in the sense orientation and the seed mass is at least about 10% less
than the average mass of seeds of the same plant variety which lack the recombinant
expression cassette.

42. The seed of claim 41, which has a mass at least about 20% less
than the average mass of seeds of the same plant variety which lack the recombinant
expression cassette.

43. The seed of claim 41, which has a mass at least about 50% less
than the average mass of seeds of the same plant variety which lack the recombinant
expression cassette.

44. A transgenic plant comprising an expression cassette containing a
plant promoter operably linked to a heterologous ADC polynucleotide.

45. The transgenic plant of claim 44, wherein the ADC nucleic acid
hybridizes under stringent conditions to a nucleic acid having a sequence as set forth in
Genbank Accession Nos. U12546, AF003101, and AF003094.

46. The transgenic plant of claim 44, wherein the ADC nucleic acid
hybridizes under stringent conditions to a nucleic acid having a sequence as set forth in
SEQ ID NO:1.

47. The transgenic plant of claim 44, wherein the ADC nucleic acid
hybridizes under stringent conditions to a nucleic acid having a sequence as set forth in
SEQ ID NO:2.

48. The transgenic plant of claim 44, wherein the ADC
polynucleotide is selected from a group consisting of Genbank accession numbers
U12546, AF003094, AF003095, AF003096, AF003097, AF003098, AF003099,
AF003100, AF003101, AF003102, AF003103, AF003104, and AF003105.

49. The transgenic plant of claim 44, wherein the heterologous ADC
polynucleotide encodes an ADC polypeptide.

50. The transgenic plant of claim 44, wherein the heterologous ADC
polynucleotide is linked to the promoter in an antisense orientation.

51. The transgenic plant of claim 44, which is a member of the genus
Brassica.

52. An isolated nucleic acid molecule comprising an expression
cassette containing a plant promoter operably linked to a heterologous ADC
polynucleotide.

53. The isolated nucleic acid molecule of claim 52, wherein the ADC
nucleic acid hybridizes under stringent conditions to a nucleic acid having a sequence
as set forth in Genbank Accession Nos. U12546, AF003101, and AF003094.

54. The isolated nucleic acid molecule of claim 52, wherein the ADC
nucleic acid hybridizes under stringent conditions to a nucleic acid having a sequence
as set forth in SEQ ID NO:1.

55. The isolated nucleic acid molecule of claim 52, wherein the ADC
nucleic acid hybridizes under stringent conditions to a nucleic acid having a sequence
as set forth in SEQ ID NO:2.

56. The isolated nucleic acid molecule of claim 52, wherein the ADC
polynucleotide is selected from a group consisting of Genbank accession numbers
U12546, AF003094, AF003095, AF003096, AF003097, AF003098, AF003099,
AF003100, AF003101, AF003102, AF003103, AF003104, and AF003105.

57. The isolated nucleic acid of claim 52, wherein the heterologous
ADC polynucleotide encodes an ADC polypeptide.

58. The isolated nucleic acid of claim 52, wherein the heterologous
ADC polynucleotide is linked to the promoter in an antisense orientation.

59. The isolated nucleic acid of claim 52, which is a member of the
genus Brassica.

Figure 5. Schematic diagram of pAP2

Figure 6. Schematic diagram of pBEL1

SEQUENCE LISTING



<110> Jofuku, K. Diane 
Okamuro, Jack K. 

The Regents of the University of California 

<120> Methods for Improving Seeds 

<130> 023070-067230PC

<140> WO PCT/US99/03429 
<141> 1999-02-17 

<150> US 09/026,039 
<151> 1998-02-19 

<160> 104 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 1669 
<212> DNA 

<213> Brassica napus 
<220> 

<223> canola (rape) APETALA2 (AP2) domain containing 
(ADC) gene sequence 

<220> 

<221> modified_base 

<222> (5) 

<223> n = unknown 

<220> 

<221> modified_base 

<222> (7) 

<223> n = unknown 

<220> 

<221> modified_base 

<222> (15) 

<223> n = unknown 

<220> 

<221> modified_base

<222> (17) 

<223> n = unknown 

<220> 

<221> modified_base

<222> (20) 

<223> n = unknown 

<220> 

<221> modified_base 

<222> (70) 

<223> n = unknown 
<220>

<221> modified_base 

<222> (435) 

<223> n = unknown 



<220> 

<221> modified_base 
<222> (1051) 
<223> n = unknown 



<400> 1 

cagcngngtt ttccntnatn gttcgtggcg gcccacgtgg taggagaaga cggaaattaa 60
caaattcatn gtcaccctac aagaaagggg gaacataatt aaatttcgag tagttggagt 120
aggagaagct caatagtaca agaaattaaa taagatactc ccatctacat catcttgctt 180
tttctcatcc cataatagtt ttccagtaaa actgtgaacc ttgtgaattt aatttccctt 240
ttatatataa agataccttg tggtgtttat actcaggaga ccagaaacta ggaaacagtc 300
tattacattt gtattgggaa ataatcaaat ctcaaaattt gattcattca taaactttat 360
actatatact ttttcgttga taaatttttt gcctctcctt ctaataacga atggagtcct 420
agcacatata tattncctaa tgtgattttc atttcatatg gatatttatt tatatatgac 480
tatttattcc atcccatctt attgcttttg atggttctct tgtcattaga gtcttctctc 540
ttattgctct catatttctt tgcttttgtt tcctctttat tacaagagag atatgtggaa 600
ccttaacgac tcacctgatc accacgaaga atccgacggt agatggaaac gggctggaga 660
tgtaccaatc tctatgagat catcgaccac gtgtctgtcg tctgttcctc ccgtgacccg 720
gatatttttt tccgaatcaa atcatggaac aggaagttcc aggaatatct gggtcccgta 780
tcactagaaa ccagtctctt gttcggtcga atcctagcgg gtctggtcgt ccggaaaacc 840
tagagctgga gataacacag ccggtaaaaa agagccgacg tggtcctcgc tcacggagct 900
ctcagtatag aggagttact ttttatcgac gaaccggaag atgggagtca catatttggt 960
aacttaattt tcttaacccg acgatatacc gaatactatt attacctata tggtaaatct 1020
atcaaataca tgtttcattt catttgagcc nataccgtat tgttgttttt aaaatatgtt 1080
tggaatctta tgcagggact gcgggaagca agtgtactta ggtatgatca tgtaatgttg 1140
ttcaaacaca gatcaaatat cctattgaaa ctaagttgtg ttgtgtctgt ccatttttat 1200
atgatttctt cgaccaaata aaggttttat tatctcctta tattactttt tgttacatat 1260
tcaggtggat ttgacacagc acatgccgct gctcggtatg ttttactcat ccaaatatga 1320
tcaattagaa cgaatctaat attccttatt ttgtaatttg ctgatataca aattaatttg 1380
ggtgggtaac tgtttgggac agtgcctacg atagagccgc agttaagttt agaggtgtag 1440
atgcagatat aaatttcaat attgaagact atgtggagga tttgaaacag gtaaaatatt 1500
tattatttgt aggttcaagc aattgactta gattattact cgaacataaa acaaattaat 1560
atttgttgca gatgacgcag ttgacaaagg aagagttcat gcatgtcatt agaaggcaaa 1620
gcactgggtt tccaagaggc agctctaagt atagaggtgt cactttgca 1669



<210> 2 

<211> 803 

<212> DNA 

<213> Glycine max 



<220> 

<223> soybean APETALA2 (AP2) domain containing (ADC) 
gene sequence 

<400> 2 

ggattgtggg aaacaagttt atctaggtaa agttgattaa taacaataat tgtatatgtg 60
tttgtgagaa ctgtggcagt tatttttcct aatattgttt taagaggcta aaacggtttt 120
tttttccttg ttttgtgttt tttgtcttgg ctgtgatgcg gtagagacaa gagtgtgagt 180
gtgtgttgtg tgtgggtgag gatttttttr tttttttgtg gtgactgact tgatggtttt 240
tgtctgggta aaatttgtct aggtggattt gacacagcac atgcggctgc tcggtgagcc 300
cttgccccct cctttagtat tataccaagc ttgtaatatt actttttcca tgtcttgaac 360
caaatatcaa atattattgt aaatcacatt tcgttgtggg ccggggaatt gtgagtctca 420
aagaaaattg tgtattttcc gtctctcttt tcagtgctta tgatagagcg gctattaaat 480
tccgaggagt ggaggctgac attaacttca atattggaga ctatgaagat gacttgaagc 540
aggtgatcaa tttgtggatt atgttttttt tattcgatat aaatgcattt atcgtattta 600
tcttatcttg aacagtcata cgtataggat gcaccttatc tcctacagtt agtgtttttt 660
tttatcttga attattctca tgattttgtt aaatgcaatg ttaatagatg agcaatctta 720
ccaaggaaga gttcgtccac gtgcttcgcc gccaaagcac tggatttccg agaggaagct 780
ccaagtatag aggtgtcact tgc 803

<210> 3 
<211> 11721 
<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<223> Arabidopsis APETALA2 (AP2) complete genomic 
sequence 

<400> 3 

gtaagcaata atatagtttg aacaatgagt tcaaaagcct ttggttggaa taactcatta 60 
agaaaacaga tagtttaaaa tcaattgaat tatgatttca gacatagtat aagtttgggc 120 
aaagtatttt aatggaaata gaatacaaac ttgataacaa gttattgcca tttagaaaat 180 
gttataactc cttatatagt gagaacccta ttgcttgctt gtgttcaagg aattgctgtt 240 
tatggaactc cactttaaat aaacatataa ataatctatt tagtatcaaa aacttaatat 300 
cccatctact ttgaatttgc ccgatcatga aaacacctat ttattgtaca taaatataca 360 
tctttaacaa atatagatct atttgtattt gtatcttatt accttttatt catatagaaa 420 
aaggaaaaaa caaacaaaaa gctttgttcc aagagtttaa agtatatcaa aattggcagt 480 
attgtggggt tttgaagtaa tctactagat tattcatttt cttgcaaaaa acaccttttt 540 
atagttgctt accgaaccaa gccgtggact tcactttctt aaaacatgaa attacatatt 600 
aaaggaatct tttgattaat caaagaagat gtttgtatac atagataata aatgaaagtg 660 
gaaatatttt tttttaaaat aaagaaaaaa aataaagaaa ctccaatacg aggaattgtt 720 
tcgaatataa tcttttttcc aaagaaagca agtggttaga gagagaataa tattatttta 780 
ttaaattttt acaaaaaaaa cagtgacaag agaagagaga gagagaaagg gcagtggaag 840 
taaatataaa ggaaaggata aaaatgaaag ctttcgtaga agcaatctat caaattttta 900 
ttttattttc ttctctctct ctctttagct cttttttttt tgttttcatt aaagttttta 960 
ttttattttc taccaaccaa aagcttttct ctttggtttc tcttatttag cttctaacct 1020 
tgaggagaat cataccagag gattgaagtt tgaaccttca aagatcaaaa tcaagaaacc 1080 
aaaaaaaaac aaaaaaaatg tgggatctaa acgacgcacc acaccaaaca caaagagaag 1140
aagaatctga agagttttgt tattcttcac caagtaaacg ggttggatct ttctctaatt 1200 
ctagctcttc agctgttgtt atcgaagatg gatccgatga cgatgaactt aaccgggtca 1260 
gacccaataa cccacttgtc acccatcagt tcttccctga gatggattct aacggcggtg 1320 
gtgttgcttc tggctttcct cgggctcact ggtttggtgt taagttttgt cagtcggatc 1380 
tagccaccgg atcgtccgcg ggtaaagcta ccaacgttgc cgctgccgta gtggagccgg 1440 
cacagccgtt gaaaaagagt cggcgtggac caagatcaag aagttctcag tatagaggtg 1500 
ttacgtttta ccggcgtacc ggaagatggg aatctcatat ttggtaataa tctcatattt 1560 
ttaatttcgt taatcgatcg tactttagat tataaattta agtttttttt tgtttgttct 1620 
tctgaatttc agggactgtg ggaaacaagt ttacttaggt aattttattt tcctcatgtt 1680 
tttttttgta ttttggtgtt gaaaaatgtc atcataattt taatttatta taanctctga 1740 
ataggtggat ttgacactgc tcatgcagca gctcggtatt tttctctctt tgactctctc 1800 
tatattgagt tgttatttat ttattttttt aaaaataccg gaagaaattt ataaaaatta 1860 
attttaattt tgttttattt aatagagcat atgatagagc tgctattaaa ttccgtggag 1920 
tagaagcgga tatcaatttc aacatcgaag attatgatga tgacttgaaa caggtaaata 1980 
taaattataa actatattgg tttttattaa cgatttttaa aggtttggga gattaatatt 2040 
gaaattgaat tttatagatg actaatttaa ccaaggaaga gttcgtacac gtacttcgcc 2100 
gacaaagcac aggcttccct cgaggaagtt cgaagtatag aggtgtcact ttgcataagt 2160 
gtggtcgttg ggaagctcga atgggtcaat tcttaggcaa aaagtataat ttctctcatt 2220 
ttatattcac tcgaaaactt catttttagt ttgttatttt aactttgagt ttttgtttct 2280 
tgaatcttat aaaataggta tgtttatttg ggtttgttcg acaccgaggt cgaagctgct 2340
aggtaaatgt ctttttgttt gattctacaa cacacattgt tgtataatgt gtttttctcg 2400 
ttactaattg attttcatta ttttatatat aatcacagag cttacgataa agctgcaatc 2460 
aaatgtaacg gcaaagacgc cgtgaccaac tttgatccga gtatttacga tgaggaactc 2520
aatgccggta aattgtctca tttaatcgag taattttata tattttttgg tccttagttt 2580 
catcactttg gtgttcgaac ttggtttaaa gattttgaat ttggtgtata tagagtcatc 2640 
agggaatcct actactccac aagatcacaa cctcgatttg agcttgggaa attcggctaa 2700 
ttcgaagcat aaaagtcaag atatgcggct cagggtaggg tttaatctta tattattaac 2760 
aataatttat atcttaatat attgtttata tgtttataaa catgttttct tttgttttgc 2820 
tttcagatga accaacaaca acaagattct ctccactcta atgaagttct tggattaggt 2880
caaaccggaa tgcttaacca tactcccaat tcaaaccacc aagtgagtaa ataaccacaa 2940 
atgcaaatac cataatttca tttgaatata ttttatctaa agaattgcat ttttttttgg 3000 
taaattagtt tccgggcagc agcaacattg gtagcggagg cggattctca ctgtttccgg 3060 
cggctgagaa ccaccggttt gatggtcggg cctcgacgaa ccaagtgttg acaaatgctg 3120 
cagcatcatc aggattctct cctcatcatc acaatcagat ttttaattct acttctactc 3180 
ctcatcaaaa ttggctgcag acaaatggct tccaacctcc tctcatgaga ccttcttgaa 3240 
tcttttatat ttttaaggtt tattattata taagaaaaac aaaaatgaac ctttgaaatc 3300 
cccacatgtt cttggtcatt tcattaatca tcggcttata ttttgcttat tttcccctaa 3360 
atcctcttgt taacttaggc gaacaaaaaa aattaatgga aatctttttc cctccatcgg 3420 
ttacaaaaat aatattatat ataattgttg gatatatggg aaattggata agtttgtgat 3480 
ttgagatgtt ctgactaaaa aggttgagaa gagatttgtc aatgagttgt cttttgttgt 3540 
ttctcctcaa tacatttatt aagattttaa aacattactt ctttatatgt tacctgcacc 3600 
tatctacata tatgttcagt cctaacttgt tttgttattc ctctttcata tctattcaat 3660 
taatgttttc ctagccttag ttcattttac atttttcttg aaaatctctc atgaaaaaaa 3720 
cacattcatg tgtgaaatat atttcaacac ccattatagt ttcgttaatt cagatatata 3780 
atttttattt attacatata ataaaaattg acaggtgggt atacacatgg tttccttgtg 3840 
ctaacattgg tttgaatagc ataaaccgaa tcctaaaata ctttaagttc acttcgtaaa 3900 
taaaatctga taaactgaac aaattaggtt ctctaaagtt gagatgggta aatgttcagc 3960 
taactcatgg agttggaatt gtgattctcg cttgttcaca attgcttttg gtattcacaa 4020 
ggatactaat tgtatgcttt gtttgtggtt gtgctttgtt tgccattgtt actcctttat 4080 
ggagatctat gttctatctt gtncccctgt tgttgggaaa tctttatttg cagggaatga 4140 
aagcttcttt ggtgtttatg gatttgctgt tgcatcttgt ttggacatgc tatcttctaa 4200 
aacccttgaa gttcccttgc gagacatttc tatcgcttat cgttctagca cctacacgtt 4260 
ggtagcagaa gatattgtag cgaaaactgc cttatcattg aataacgccc tcttgttggc 4320 
gtggaacgaa gactctttgg ttttgcaatt gttcatggac tcccagttca gtgtaaaatc 4380 
tttcaagtca ctcgaagtca ccgttgagct agggagaatt ctgcttgttt taagcttctt 4440 
tcaccttcgg tttatcccca tgttttttgc attattctct tatactgcgt ttgtgtttgc 4500 
agttctttca gtttttagtg atgtgtttta ctcttgtaat ggagactaaa ttatttctat 4560 
atctataaaa ttagtttgat aaaaaaaaaa aaactgaaca aattttattt cgttcagcaa 4620
cgcttataaa tgccaaatag aaaagttcag ttattttaaa gcttttttgc atatatatat 4680
tatctaacct tttattgaat tagagaattt tagtcaacca ttaaaaatta cgaattttcc 4740 
tagtttatac tatttagaag ttgcagtatc aatcatcacg ggaaaaaaat aaaatgatat 4800 
ttttggtttt gtccattttg tagatatttt accgacagaa gaaaaaaaca gtgatatttt 4860 
atttttttac agtattaatg gtgagacgag agagagaaac agaagtagat tggttcttat 4920
gtttccctaa tggagtaaga aataatactc acagttattt tgcgatctca acaaaagtta 4980
aaatgaatat caaccgtaag atctctttct tctgttttca cagactatga aatataaaaa 5040 
atcataatgc ctacaagcta tgcgaccgtt agttaaaaaa aaatatgaat aatattccaa 5100 
agagaaaaaa atcttggaaa tagaaaatta ggtataaaga gaaaagaggc aaataaaaat 5160 
tggggattat tagggatgag tggtcatttt caagtaatgt gtcctttgaa tcagtttctc 5220 
tctctatctc tcacagaaac ccaaaagaag tcagacattg ttataatggt gagagagact 5280 
tctcggcttc acttcctttc tcctctttta attctctttt taattcaaaa gtttaaagat 5340 
tttattaaat gtttcatact cttacactta ttcatgaatc ctttcctaga tttcactttt 5400
actctctgta taatttgttc ctccttaaac tcttggttca tctatttggt tacgttacat 5460 
cctaatagtc tttatcatat atacctttgg gtccttttac taatactagg taaaaacttc 5520 
tagttgtata atgcttaaaa ttacaatggg gttatgatga tattgttatt ttataatcgc 5580 
attagaattg caaacaaaaa ttgtttatgc taggataaac attttaagtg agaagactat 5640 
cccttttttt atatctatat ttagattcgg aattttctta ctaacgaaaa atatatagat 5700 
ggagaactat gacaatacac ttttgcctta caatgacatt atggttattt acctaattga 5760 
atgatacaaa atgagatgga gttgtatgaa tttatagcaa ctgtctttct gcttcttttt 5820 
ttttttttta atagagtgga gacttgaatt cnnwrttnag natgnnnncy gaattcaagt 5880 
ctccactcta ttaaaaaaaa aaaaaagaag cagaaagaca gttgctataa attcatacaa 5940
ctccatctca ttttgtatca ttcaattagg taaataacca taatgtcatt gtaaggcaaa 6000 
agtgtattgt catagttctc catctatata tttttcgtta gtaagaaaat tccgaatcta 6060 
aatatagata taaaaaaagg gatagtcttc tcacttaaaa tgtttatcct agcataaaca 6120 
atttttgttt gcaattctaa tgcgattata aaataacaat atcatcataa ccccattgta 6180 
attttaagca ttatacaact agaagttttt acctagtatt agtaaaagga cccaaaggta 6240 
tatatgataa agactattag gatgtaacgt aaccaaatag atgaaccaag agtttaagga 6300 
ggaacaaatt atacagagag taaaagtgaa atctaggaaa ggattcatga ataagtgtaa 6360 
gagtatgaaa catttaataa aatctttaaa cttttgaatt aaaaagagaa ttaaaagagg 6420 
agaaaggaag tgaagccgag aagtctctct caccattata acaatgtctg acttcttttg 6480 
ggtttctgtg agagatagag agagaaactg attcaaagga cacattactt gaaaatgacc 6540
actcatccct aataatcccc aatttttatt tgcctctttt ctctttatac ctaattttct 6600 
atttccaaga tttttttctc tttggaatat tattcatatt tttttttaac taacggtcgc 6660 
atagcttgta ggcattatga ttttttatat ttcatagtct gtgaaaacag aagaaagaga 6720 
tcttacggtt gatattcatt ttaacttttg ttgagatcgc aaaataactg tgagtattat 6780 
ttcttactcc attagggaaa cataagaacc aatctacttc tgtttctctc tctcgtctca 6840 
ccattaatac tgtaaaaaaa taaaatatca ctgttttttt crtctgtcgg taaaatatct 6900 
acaaaatgga caaaaccaaa aatatcattt tatttttttc ccgtgatgat tgatactgca 6960 
acttctaaat agtataaact aggaaaattc gtaattttta atggttgact aaaattctct 7020 
aattcaataa aaggttagat aatatatata tgcaaaaaag ctttaaaata actgaacttt 7080 
tctatttggc atttataagc gttgctgaac gaaataaaat ttgttcagtt tttttttttt 7140 
tatcaaacta attttataga tatagaaata atttagtctc cattacaaga gtaaaacaca 7200 
tcactaaaaa ctgaaagaac tgcaaacaca aacgcagtat aagagaataa tgcaaaaaac 7260 
atggggataa accgaaggtg aaagaagctt aaaacaagca gaattctccc tagctcaacg 7320 
gtgacttcga gtgacttgaa agattttaca ctgaactggg agtccatgaa caattgcaaa 7380 
accaaagagt cttcgttcca cgccaacaag agggcgttat tcaatgataa ggcagttttc 7440 
gctacaatat cttctgctac caacgtgtag gtgctagaac gataagcgat agaaatgtct 7500 
cgcaagggaa cttcaagggt tttagaagat agcatgtcca aacaagatgc aacagcaaat 7560 
ccataaacac caaagaagct ttcattccct gcaaataaag atttcccaac aacaggggna 7620 
caagatagaa catagatctc cataaaggag taacaatggc aaacaaagca caaccacaaa 7680 
caaagcatac aattagtatc cttgtgaata ccaaaagcaa ttgtgaacaa gcgagaatca 7740 
caattccaac tccatgagtt agctgaacat ttacccatct caactttaga gaacctaatt 7800 
tgttcagttt atcagatttt atttacgaag tgaacttaaa gtattttagg attcggttta 7860 
tgctattcaa accaatgtta gcacaaggaa accatgtgta tacccacctg tcaattttta 7920 
ttatatgtaa taaataaaaa ttatatatct gaattaacga aactataatg ggtgttgaaa 7980 
tatatttcac acatgaatgt gtttttttca tgagagattt tcaagaaaaa tgtaaaatga 8040 
actaaggcta ggaaaacatt aattgaatag atatgaaaga ggaataacaa aacaagttag 8100 
gactgaacat atatgtagat aggtgcaggt aacatataaa gaagtaatgt tttaaaatct 8160 
taataaatgt attgaggaga aacaacaaaa gacaactcat tgacaaatct cttctcaacc 8220 
tttttagtca gaacatctca aatcacaaac ttatccaatt tcccatatat ccaacaatta 8280 
tatataatat tatttttgta accgatggag ggaaaaagat ttccattaat tttttttgtt 8340 
cgcctaagtt aacaagagga tttaggggaa aataagcaaa atataagccg atgattaatg 8400 
aaatgaccaa gaacatgtgg ggatttcaaa ggttcatttt tgtttttctt atataataat 8460 
aaaccttaaa aatataaaag attcaagaag gtctcatgag aggaggttgg aagccatttg 8520 
tctgcagcca attttgatga ggagtagaag tagaattaaa aatctgattg tgatgatgag 8580 
gagagaatcc tgatgatgct gcagcatttg tcaacacttg gttcgtcgag gcccgaccat 8640 
caaaccggtg gttctcagcc gccggaaaca gtgagaatcc gcctccgcta ccaatgttgc 8700 
tgctgcccgg aaactaattt accaaaaaaa aatgcaattc tttagataaa atatattcaa 8760 
atgaaattat ggtatttgca tttgtggtta tttactcact tggtggtttg aattgggagt 8820 
atggttaagc attccggttt gacctaatcc aagaacttca ttagagtgga gagaatcttg 8880 
ttgttgttgg ttcatctgaa agcaaaacaa aagaaaacat gtttataaac atataaacaa 8940 
tatattaaga tataaattat tgttaataat ataagattaa accctaccct gagccgcata 9000 
tcttgacttt tatgcttcga attagccgaa tttcccaagc tcaaatcgag gttgtgatct 9060 
tgtggagtag taggattccc tgatgactct atatacacca aattcaaaat ctttaaacca 9120 
agttcgaaca ccaaagtgat gaaactaagg accaaaaaat atataaaatt actcgattaa 9180 
atgagacaat ttaccggcat tgagttcctc atcgtaaata ctcggatcaa agttggtcac 9240 
ggcgtctttg ccgttacatt tgattgcagc tttatcgtaa gctctgtgat tatatataaa 9300 
ataatgaaaa tcaattagta acgagaaaaa cacattatac aacaatgtgt gttgtagaat 9360 
caaacaaaaa gacatttacc tagcagcttc gacctcggtg tcgaacaaac ccaaataaac 9420 
atacctattt tataagattc aagaaacaaa aactcaaagt taaaataaca aactaaaaat 9480 
gaagttttcg agtgaatata aaatgagaga aattatactt tttgcctaag aattgaccca 9540 
ttcgagcttc ccaacgacca cacttatgca aagtgacacc tctatacttc gaacttcctc 9600 
gagggaagcc tgtgctttgt cggcgaagta cgtgtacgaa ctcttccttg gttaaattag 9660 
tcatctataa aattcaattt caatattaat ctcccaaacc tttaaaaatc gttaataaaa 9720 
accaatatag tttataattt atatttacct gtttcaagtc atcatcataa tcttcgatgt 9780 
tgaaattgat atccgcttct actccacgga atttaatagc agctctatca tatgctctat 9840 
taaataaaac aaaattaaaa ttaattttta taaatttctt ccggtatttt taaaaaaata 9900 
aataaataac aactcaatat agagagagtc aaagagagaa aaataccgag ctgctgcatg 9960 
agcagtgtca aatccaccta ttcagagntt ataataaatt aaaattatga tgacattttt 10020 
caacaccaaa atacaaaaaa aaacatgagg aaaataaaat tacctaagta aacttgtttc 10080 
ccacagtccc tgaaattcag aagaacaaac aaaaaaaaac ttaaatttat aatctaaagt 10140 
acgatcgatt aacgaaatta aaaatatgag attattacca aatatgagat tcccatcttc 10200
cggtacgccg gtaaaacgta acacctctat actgagaact tcttgatctt ggtccacgcc 10260 
gactcttttt caacggctgt gccggctcca ctacggcagc ggcaacgttg gtagctttac 10320 
ccgcggacga tccggtggct agatccgact gacaaaactt aacaccaaac cagtgagccc 10380 
gaggaaagcc agaagcaaca ccaccgccgt tagaatccat ctcagggaag aactgatggg 10440 
tgacaagtgg gttattgggt ctgacccggt taagttcatc gtcatcggat ccatcttcga 10500 
taacaacagc tgaagagcta gaattagaga aagatccaac ccgtttactt ggtgaagaat 10560 
aacaaaactc ttcagattct tcttctcttt gtgtttggtg tggtgcgtcg tttagatccc 10620 
acattttttt tgtttttttt tggtttcttg attttgatct ttgaaggttc aaacttcaat 10680 
cctctggtat gattctcctc aaggttagaa gctaaataag agaaaccaaa gagaaaagct 10740 
tttggttggt agaaaataaa ataaaaactt taatgaaaac aaaaaaaaaa gagctaaaga 10800 
gagagagaga agaaaataaa ataaaaattt gatagattgc ttctacgaaa gctttcattt 10860 
ttatcctttc ctttatattt acttccactg ccctttctct ctctctcttc tcttgtcact 10920 
gttttttttg taaaaattta ataaaataat attattctct ctctaaccac ttgctttctt 10980 
tggaaaaaag attatattcg aaacaattcc tcgtattgga gtttctttat tttttttctt 11040 
tattttaaaa aaaaatattt ccactttcat ttattatcta tgtatacaaa catcttcttt 11100 
gattaatcaa aagattcctt taatatgtaa tttcatgttt taagaaagtg aagtccacgg 11160 
cttggttcgg taagcaacta taaaaaggtg ttttttgcaa gaaaatgaat aatctagtag 11220 
attacttcaa aaccccacaa tactgccaat tttgatatac tttaaactct tggaacaaag 11280 
ctttttgttt gttttttcct ttttctatat gaataaaagg taataagata caaatacaaa 11340 
tagatctata tttgttaaag atgtatattt atgtacaata aataggtgtt ttcatgatcg 11400 
ggcaaattca aagtagatgg gatattaagt ttttgatact aaatagatta tttatatgtt 11460
tatttaaagt ggagttccat aaacagcaat tccttgaaca caagcaagca atagggttct 11520 
cactatataa ggagttataa cattttctaa atggcaataa cttgttatca agtttgtatt 11580 
ctatttccat taaaatactt tgcccaaact tatactatgt ctgaaatcat aattcaattg 11640 
attttaaact atctgttttc ttaatgagtt attccaacca aaggcttttg aactcattgt 11700
tcaaactata ttattgctta c 11721 

<210> 4 
<211> 67 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(67)

<223> AP2-R1 AP2-like subclass AP2 domain repeat, amino
acid positions 129-195

<400> 4

Ser Ser Gln Tyr Arg Gly Val Thr Phe Tyr Arg Arg Thr Gly Arg Trp
1 5 10 15

Glu Ser His Ile Trp Asp Cys Gly Lys Gln Val Tyr Leu Gly Gly Phe
20 25 30

Thr Asp Ala His Ala Ala Ala Arg Ala Tyr Asp Arg Ala Ala Ile Lys
35 40 45

Phe Arg Gly Val Glu Ala Asp Ile Asn Phe Asn Ile Asp Asp Tyr Asp
50 55 60

Asp Asp Leu 
65 



<210> 5 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220>

<221> DOMAIN
<222> (1)..(68)

<223> AP2-R2 AP2-like subclass AP2 domain repeat, amino
acids 221-288

<400> 5

Ser Ser Lys Tyr Arg Gly Val Thr Leu His Lys Cys Gly Arg Trp Glu
1 5 10 15

Ala Arg Met Gly Gln Phe Leu Gly Lys Lys Tyr Val Tyr Leu Gly Leu
20 25 30

Phe Asp Thr Glu Val Glu Ala Ala Arg Ala Tyr Asp Lys Ala Ala Ile
35 40 45

Lys Cys Asn Gly Lys Asp Ala Val Thr Asn Phe Asp Pro Ser Ile Tyr
50 55 60

Asp Glu Glu Leu 
65 



<210> 6 
<211> 18 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> HELIX 
<222> (1)..(18)

<223> putative AP2-R1 amphipathic alpha-helix, amino
acids 160-177

<400> 6

Phe Asp Thr Ala His Ala Ala Ala Arg Ala Tyr Asp Arg Ala Ala Ile
1 5 10 15 

Lys Phe 



<210> 7 
<211> 18 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> HELIX 
<222> (1)..(18)

<223> putative AP2-R2 amphipathic alpha-helix, amino
acids 253-270

<400> 7

Phe Asp Thr Glu Val Glu Ala Ala Arg Ala Tyr Asp Lys Ala Ala Ile
1 5 10 15 

Lys Cys 

<210> 8
<211> 4
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: AP2-like
subclass AP2 domain conserved RAYD element 

<400> 8 

Arg Ala Tyr Asp 
1 



<210> 9 
<211> 77 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(77)

<223> ANT-R1 AP2-like subclass AP2 domain repeat
<400> 9

Thr Ser Gln Tyr Arg Gly Val Thr Arg His Arg Trp Thr Gly Arg Tyr
1 5 10 15

Glu Ala His Leu Trp Asp Asn Ser Phe Lys Lys Glu Gly His Ser Arg
20 25 30

Lys Gly Arg Gln Val Tyr Leu Gly Gly Tyr Asp Met Glu Glu Lys Ala
35 40 45

Ala Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr Trp Gly Pro Ser Thr
50 55 60

His Thr Asn Phe Ser Ala Glu Asn Tyr Gln Lys Glu Ile
65 70 75 



<210> 10 
<211> 69 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(69)

<223> ANT-R2 AP2-like subclass AP2 domain repeat
<400> 10

Ala Ser Ile Tyr Arg Gly Val Thr Arg His His Gln His Gly Arg Trp
1 5 10 15

Gln Ala Arg Ile Gly Arg Val Ala Gly Asn Lys Asp Leu Tyr Leu Gly
20 25 30

Thr Phe Gly Thr Gln Glu Glu Ala Ala Glu Ala Tyr Asp Val Ala Ala
35 40 45

Ile Lys Phe Arg Gly Thr Asn Ala Val Thr Asn Phe Asp Ile Thr Arg
50 55 60

Tyr Asp Val Asp Arg
65



<210> 11 
<211> 67 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(67)

<223> RAP2.7-R1 AP2-like subclass AP2 domain repeat
<400> 11

Ser Ser Gln Tyr Arg Gly Val Thr Phe Tyr Arg Arg Thr Gly Arg Trp
1 5 10 15

Glu Ser His Ile Trp Asp Cys Gly Lys Gln Val Tyr Leu Gly Gly Phe
20 25 30

Asp Thr Ala His Ala Ala Ala Arg Ala Tyr Asp Arg Ala Ala Ile Lys
35 40 45

Phe Arg Gly Val Asp Ala Asp Ile Asn Phe Thr Leu Gly Asp Tyr Glu
50 55 60

Glu Asp Met 
65 



<210> 12 
<211> 53 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(53)

<223> RAP2.7-R2 AP2-like subclass AP2 domain repeat
<400> 12

Ser Ser Lys Tyr Arg Gly Val Thr Leu His Lys Cys Gly Arg Trp Glu
1 5 10 15

Ala Arg Met Gly Gln Phe Leu Gly Lys Lys Ala Tyr Asp Lys Ala Ala
20 25 30

Ile Asn Thr Asn Gly Arg Glu Ala Val Thr Asn Phe Glu Met Ser Ser
35 40 45

Tyr Gln Asn Glu Ile
50

<210> 13
<211> 5
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: AP2-like
subclass AP2 domain conserved YRG element YRG
motif consensus sequence

<400> 13 

Tyr Arg Gly Val Thr 
1 5 



<210> 14 
<211> 6 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: AP2-like 
subclass AP2 domain conserved YRG element 
WEAR/WESH motif consensus sequence

<220> 

<221> MOD_RES 
<222> (5) 

<223> Xaa = Ala or Ser 
<220> 

<221> MOD_RES 
<222> (6) 

<223> Xaa = Arg or His
<400> 14 

Gly Arg Trp Glu Xaa Xaa 
1 5 



<210> 15 
<211> 4 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: AP2-like
subclass AP2 domain RAYD element consensus 
sequence 

<400> 15 
Val Tyr Leu Gly 
1 



<210> 16 
<211> 4 
<212> PRT 

<213> Artificial Sequence 



WO 99/41974    PCT/US99/03429



<220> 

<223> Description of Artificial Sequence: AP2-like
subclass AP2 domain conserved RAYD element
consensus sequence

<400> 16
Ala Ala Ile Lys
1 



<210> 17 
<211> 69 
<212> PRT 

<213> Nicotiana tabacum 
<220> 

<221> DOMAIN 
<222> (1)..(69)

<223> EREBP-1 EREBP-like subclass AP2 domain

<400> 17

Gly Arg His Tyr Arg Gly Val Arg Arg Arg Pro Trp Gly Lys Phe Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro Ala Lys Asn Gly Ala Arg Val Trp Leu Gly
20 25 30

Thr Tyr Glu Thr Asp Glu Glu Ala Ala Ile Ala Tyr Asp Lys Ala Ala
35 40 45

Tyr Arg Met Arg Gly Ser Lys Ala His Leu Asn Phe Pro Leu Glu Val
50 55 60

Ala Asn Phe Lys Gln
65 



<210> 18 
<211> 69 
<212> PRT 

<213> Nicotiana tabacum 



<220> 

<221> DOMAIN 
<222> (1)..(69)

<223> EREBP-2 EREBP-like subclass AP2 domain

<400> 18

Gly Arg His Tyr Arg Gly Val Arg Gln Arg Pro Trp Gly Lys Phe Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro Ala Lys Asn Gly Ala Arg Val Trp Leu Gly
20 25 30

Thr Tyr Glu Thr Ala Glu Glu Ala Ala Leu Ala Tyr Asp Lys Ala Ala
35 40 45

Tyr Arg Met Arg Gly Ser Lys Ala Leu Leu Asn Phe Pro His Arg Ile
50 55 60

Gly Leu Asn Glu Pro
65



<210> 19 
<211> 68 
<212> PRT 

<213> Nicotiana tabacum 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> EREBP-3 EREBP-like subclass AP2 domain
<400> 19

Glu Val His Tyr Arg Gly Val Arg Lys Arg Pro Trp Gly Arg Tyr Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro Gly Lys Lys Ser Arg Val Trp Leu Gly Thr
20 25 30

Phe Asp Thr Ala Glu Glu Ala Ala Lys Ala Tyr Asp Thr Ala Ala Arg
35 40 45

Glu Phe Arg Gly Pro Lys Ala Lys Thr Asn Phe Pro Ser Pro Thr Glu
50 55 60

Asn Gln Ser Pro
65 



<210> 20 
<211> 69 
<212> PRT 

<213> Nicotiana tabacum 



<220> 

<221> DOMAIN 
<222> (1)..(69)

<223> EREBP-4 EREBP-like subclass AP2 domain
<400> 20

Lys Lys His Tyr Arg Gly Val Arg Gln Arg Pro Trp Gly Lys Phe Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro Asn Arg Lys Gly Thr Arg Val Trp Leu Gly
20 25 30

Thr Phe Asp Thr Ala Ile Glu Ala Ala Lys Ala Tyr Asp Arg Ala Ala
35 40 45

Phe Lys Leu Arg Gly Ser Lys Ala Ile Val Asn Phe Pro His Arg Ile
50 55 60

Gly Leu Asn Glu Pro 
65 

<210> 21
<211> 68
<212> PRT

<213> Arabidopsis thaliana
<220>

<221> DOMAIN
<222> (1)..(68)

<223> RAP2.2 EREBP-like subclass AP2 domain
<400> 21

Lys Asn Gln Tyr Arg Gly Ile Arg Gln Arg Pro Trp Gly Lys Trp Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro Arg Lys Gly Ser Arg Glu Trp Leu Gly Thr
20 25 30

Phe Asp Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Ala Ala Ala Arg
35 40 45

Arg Ile Arg Gly Thr Lys Ala Lys Val Asn Phe Pro Glu Glu Lys Asn
50 55 60

Pro Ser Val Val 
65 



<210> 22 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> RAP2.3 EREBP-like subclass AP2 domain

<400> 22

Lys Asn Val Tyr Arg Gly Ile Arg Lys Arg Pro Trp Gly Lys Trp Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro Arg Lys Gly Val Arg Val Trp Leu Gly Thr
20 25 30

Phe Asn Thr Ala Glu Glu Ala Ala Met Ala Tyr Asp Val Ala Ala Lys
35 40 45

Gln Ile Arg Gly Asp Lys Ala Lys Leu Asn Phe Pro Asp Leu His His
50 55 60

Pro Pro Pro Pro
65



<210> 23 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220>

<221> DOMAIN
<222> (1)..(68)

<223> RAP2.5 EREBP-like subclass AP2 domain
<400> 23

Glu Ile Arg Tyr Arg Gly Val Arg Lys Arg Pro Trp Gly Arg Tyr Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro Gly Lys Lys Thr Arg Val Trp Leu Gly Thr
20 25 30

Phe Asp Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Thr Ala Ala Arg
35 40 45

Asp Phe Arg Gly Ala Lys Ala Lys Thr Asn Phe Pro Thr Phe Leu Glu
50 55 60

Leu Ser Asp Gln
65 



<210> 24 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> RAP2.6 EREBP-like subclass AP2 domain
<400> 24

Pro Lys Lys Tyr Arg Gly Val Arg Gln Arg Pro Trp Gly Lys Trp Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro His Lys Ala Thr Arg Val Trp Leu Gly Thr
20 25 30

Phe Glu Thr Ala Glu Ala Ala Ala Arg Ala Tyr Asp Ala Ala Ala Leu
35 40 45

Arg Phe Arg Gly Ser Lys Ala Lys Leu Asn Phe Pro Glu Asn Val Gly
50 55 60

Thr Gln Thr Ile
65 



<210> 25 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> RAP2.12 EREBP-like subclass AP2 domain
<400> 25

Lys Asn Gln Tyr Arg Gly Ile Arg Gln Arg Pro Trp Gly Lys Trp Ala
1 5 10 15

Ala Glu Ile Arg Asp Pro Arg Glu Gly Ala Arg Ile Trp Leu Gly Thr
20 25 30

Phe Lys Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Ala Ala Ala Arg
35 40 45

Arg Ile Arg Gly Ser Lys Ala Lys Val Asn Phe Pro Glu Glu Asn Met
50 55 60 

Lys Ala Asn Ser 
65 



<210> 26 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> TINY EREBP-like subclass AP2 domain
<400> 26

His Pro Val Tyr Arg Gly Val Arg Lys Arg Asn Trp Gly Lys Trp Val
1 5 10 15

Ser Glu Ile Arg Glu Pro Arg Lys Lys Ser Arg Ile Trp Leu Gly Thr
20 25 30

Phe Pro Ser Pro Glu Met Ala Ala Arg Ala His Asp Val Ala Ala Leu
35 40 45

Ser Ile Lys Gly Ala Ser Ala Ile Leu Asn Phe Pro Asp Leu Ala Gly
50 55 60 

Ser Phe Pro Arg 
65 



<210> 27 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> RAP2.1 EREBP-like subclass AP2 domain
<400> 27

Arg Lys Pro Tyr Arg Gly Ile Arg Arg Arg Lys Trp Gly Lys Trp Val
1 5 10 15

Ala Glu Ile Arg Glu Pro Asn Lys Arg Ser Arg Leu Trp Leu Gly Ser
20 25 30

Tyr Thr Thr Asp Ile Ala Ala Ala Arg Ala Tyr Asp Val Ala Val Phe
35 40 45

Tyr Leu Arg Gly Pro Ser Ala Arg Leu Asn Phe Pro Asp Leu Leu Leu
50 55 60

Gln Glu Glu Asp
65 



<210> 28 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> RAP2.4 EREBP-like subclass AP2 domain
<400> 28

Thr Lys Leu Tyr Arg Gly Val Arg Gln Arg His Trp Gly Lys Trp Val
1 5 10 15

Ala Glu Ile Arg Leu Pro Arg Asn Arg Thr Arg Leu Trp Leu Gly Thr
20 25 30 

Phe Asp Thr Ala Glu Glu Ala Ala Leu Ala Tyr Asp Lys Ala Ala Tyr 
35 40 45 

Lys Leu Arg Gly Asp Phe Ala Arg Leu Asn Phe Pro Asn Leu Arg His 
50 55 60 

Asn Gly Phe His 
65 



<210> 29 
<211> 66 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(66)

<223> RAP2.8 EREBP-like subclass AP2 domain
<400> 29

Ser Ser Lys Tyr Lys Gly Val Val Pro Gln Pro Asn Gly Arg Trp Gly
1 5 10 15

Ala Gln Ile Tyr Glu Lys His Gln Arg Val Trp Leu Gly Thr Phe Asn
20 25 30

Glu Gln Glu Glu Ala Ala Arg Ser Tyr Asp Ile Ala Ala Cys Arg Phe
35 40 45

Arg Gly Arg Asp Ala Val Val Asn Phe Lys Asn Val Leu Glu Asp Gly
50 55 60

Asp Leu
65 



<210> 30 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> RAP2.10 EREBP-like subclass AP2 domain
<400> 30

Asp Lys Pro Tyr Lys Gly Ile Arg Met Arg Lys Trp Gly Lys Trp Val
1 5 10 15

Ala Glu Ile Arg Glu Pro Asn Lys Arg Ser Arg Ile Trp Leu Gly Ser
20 25 30 

Tyr Ser Thr Pro Glu Ala Ala Ala Arg Ala Tyr Asp Thr Ala Val Phe 
35 40 45 

Tyr Leu Arg Gly Pro Ser Ala Arg Leu Asn Phe Pro Glu Leu Leu Ala 
50 55 60 

Gly Val Thr Val 
65 



<210> 31 
<211> 68 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> DOMAIN 
<222> (1)..(68)

<223> RAP2.11 EREBP-like subclass AP2 domain
<400> 31

Lys Thr Lys Phe Val Gly Val Arg Gln Arg Pro Ser Gly Lys Trp Val
1 5 10 15

Ala Glu Ile Lys Asp Thr Thr Gln Lys Ile Arg Met Trp Leu Gly Thr
20 25 30

Phe Glu Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Glu Ala Ala Cys
35 40 45

Leu Leu Arg Gly Ser Asn Thr Arg Thr Asn Phe Ala Asn His Phe Pro
50 55 60

Asn Asn Ser Gln
65 

<210> 32
<211> 5
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: EREBP-like
subclass AP2 domain conserved YRG element YRG
motif consensus sequence

<220>

<221> MOD_RES
<222> (4)

<223> Xaa = Val or Ile
<400> 32 

Tyr Arg Gly Xaa Arg 
1 5 



<210> 33 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: EREBP-like
subclass AP2 domain conserved YRG element WAAEIRD
box motif consensus sequence

<220>

<221> MOD_RES
<222> (3)

<223> Xaa = positively charged amino acid
<220>

<221> MOD_RES
<222> (4)

<223> Xaa = Trp, Phe or Tyr
<220>

<221> MOD_RES
<222> (5)

<223> Xaa = Ala or Val
<220>

<221> MOD_RES
<222> (9)

<223> Xaa = Arg or Lys
<220>

<221> MOD_RES
<222> (10)

<223> Xaa = Asp or Glu
<400> 33

Trp Gly Xaa Xaa Xaa Ala Glu Ile Xaa Xaa
1 5 10 

<210> 34
<211> 5
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: EREBP-like
subclass AP2 domain conserved RAYD element 
consensus sequence 

<220> 

<221> MOD_RES 
<222> (4) 

<223> Xaa = Thr or Ser 
<220> 

<221> MOD_RES 
<222> (5) 

<223> Xaa = Phe or Tyr 
<400> 34 

Trp Leu Gly Xaa Xaa 
1 5 



<210> 35 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: EREBP-like
subclass AP2 domain conserved RAYD element RAYD
consensus sequence

<220> 

<221> MOD_RES 
<222> (5) 

<223> Xaa = positively charged amino acid, Ile or Leu
<400> 35 

Glu Glu Ala Ala Xaa Ala Tyr Asp 
1 5 



<210> 36 
<211> 17 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> HELIX 
<222> (1)..(17)

<223> putative RAP2.7-R1 amphipathic alpha-helix
<400> 36

Asp Thr Ala His Ala Ala Ala Arg Ala Tyr Asp Arg Ala Ala Ile Lys
1 5 10 15

Phe 
<210> 37
<211> 16 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> HELIX 
<222> (1)..(16)

<223> putative ANT-R1 amphipathic alpha-helix
<400> 37

Met Glu Glu Lys Ala Ala Arg Ala Tyr Asp Leu Ala Ala Leu Lys Tyr
1 5 10 15



<210> 38 
<211> 18 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> HELIX 
<222> (1)..(18)

<223> putative RAP2.2 amphipathic alpha-helix
<400> 38

Asp Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Ala Ala Ala Arg Arg
1 5 10 15

Ile Arg



<210> 39 
<211> 16 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> HELIX 
<222> (1)..(16)

<223> putative RAP2.5 amphipathic alpha-helix
<400> 39

Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Thr Ala Ala Arg Asp Phe
1 5 10 15



<210> 40 
<211> 18 
<212> PRT

<213> Arabidopsis thaliana
<220>

<221> HELIX
<222> (1)..(18)

<223> putative RAP2.12 amphipathic alpha-helix
<400> 40

Lys Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Ala Ala Ala Arg Arg
1 5 10 15
<210> 41
<211> 16 
<212> PRT 

<213> Nicotiana tabacum 
<220> 

<221> HELIX 
<222> (1)..(16)

<223> putative EREBP-3 amphipathic alpha-helix
<400> 41

Thr Ala Glu Glu Ala Ala Lys Ala Tyr Asp Thr Ala Ala Arg Glu Phe
1 5 10 15



<210> 42 
<211> 25 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> PEPTIDE 

<222> (1)..(25)

<223> AP2 linker region 

<400> 42 

Lys Gln Met Thr Asn Leu Thr Lys Glu Glu Phe Val His Val Leu Arg
1 5 10 15

Arg Gln Ser Thr Gly Phe Pro Arg Gly
20 25



<210> 43 
<211> 26 
<212> PRT 

<213> Arabidopsis thaliana 
<220> 

<221> PEPTIDE 

<222> (1)..(26)

<223> ANT linker region 

<400> 43 

Glu Asp Met Met Lys Asn Met Thr Arg Gln Glu Tyr Val Ala His Leu
1 5 10 15

Arg Arg Lys Ser Ser Gly Phe Ser Arg Gly
20 25



<210> 44 
<211> 26 
<212> PRT 

<213> Arabidopsis thaliana 
<220>

<221> PEPTIDE 

<222> (1)..(26)

<223> RAP2.7 linker region 

<400> 44 

Met Lys Gln Val Gln Asn Leu Ser Lys Glu Glu Phe Val His Ile Leu
1 5 10 15

Arg Arg Gln Ser Thr Gly Phe Ser Arg Gly
20 25 



<210> 45 
<211> 9 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: consensus
linker region motif 

<220> 

<221> MOD_RES
<222> (4) 

<223> Xaa = positively charged amino acid 
<400> 45 

Asn Leu Thr Xaa Glu Glu Phe Val His 
1 5 



<210> 46 
<211> 11 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: consensus 
linker region motif 

<400> 46 

Leu Arg Arg Gln Ser Thr Gly Phe Ser Arg Gly
1 5 10



<210> 47 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JOAP2U primer
<400> 47 

gttgccgctg ccgtagtg 18 

<210> 48 
<211> 22 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: JOAP2L primer
<400> 48

ggttcatcct gagccgcata tc 22

<210> 49
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: JORAP2.1U
primer

<400> 49

ctcaagaaga agtgcctaac cacg 24

<210> 50
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: JORAP2.1L
primer

<400> 50

gcagaagcta gaagagcgtc ga 22

<210> 51
<211> 18
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: JORAP2.2U
primer

<400> 51

ggaaaatggg ctgcggag 18

<210> 52
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: JORAP2.2L
primer

<400> 52

gttacctcca gcatcgaacg ag 22

<210> 53
<211> 22
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: JORAP2.4U
primer

<400> 53

gctggatctt gtttcgctta cg

<210> 54 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.4L
primer

<400> 54

gcttcaagct tagcgtcgac tg

<210> 55 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.5U
primer 

<400> 55 

agatgggctt gaaacccgac 

<210> 56 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.5L
primer 

<400> 56 

ctggctaggg ctacgcgc 

<210> 57 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.6U
primer

<400> 57

ttctttgcct cctcaaccat tg

<210> 58 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: JORAP2.6L
primer 

<400> 58 

tctgagttcc aacattttcg gg 

<210> 59 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.7U
primer

<400> 59

gaaattggta actccggttc cg

<210> 60 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.7L
primer

<400> 60

ccattttgct ttggcgcatt ac

<210> 61 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.8U
primer 

<400> 61 

ggcgttacgc ctctaccgg 

<210> 62 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.8L 
primer 

<400> 62 

cgccgtcttc cagaacgttc

<210> 63 

<211> 21 

<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: JORAP2.9U
primer 

<400> 63 

atcacggatc tggcttggtt c 

<210> 64 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.9L
primer

<400> 64

gccttcttcc gtatcaacgt cg

<210> 65 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.10U
primer 

<400> 65 

gtcaactccg gcggttacg 

<210> 66 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.10L
primer 

<400> 66 

tctccttata tacgccgccg a 

<210> 67 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.11U
primer 

<400> 67 

gagaagagca aaggcaacaa gac 

<210> 68 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: JORAP2.11L
primer 

<400> 68 

agttgttagg aaaatggttt gcg 

<210> 69 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.12U
primer 

<400> 69 

aaaccattcg ttttcacttc gactc 

<210> 70 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: JORAP2.12L
primer 

<400> 70 

tcacagagcg tttctgagaa ttagc 

<210> 71 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: AP2U primer
<400> 71 

atgtgggatc taaacgacgc ac 

<210> 72 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: AP2L primer
<400> 72 

gatcttggtc cacgccgac 

<210> 73 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.1U primer



<400> 73

aagaggacca tctctcag 

<210> 74 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.1L primer
<400> 74 

aacactcgct agcttctc 

<210> 75 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.2U primer
<400> 75 

tggttcagca gccaacac 

<210> 76 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.2L primer
<400> 76 

caatgcatag agcttgagg 

<210> 77 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.4U primer

<400> 77 

acggatttca catcggag 

<210> 78 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.4L primer
<400> 78 

ctaagctaga atcgaatcc 
<210> 79
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.5U primer
<400> 79 

taccggtttc gcgcgtag 

<210> 80 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.5L primer
<400> 80 

caccttcgaa atcaacgacc g 

<210> 81 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.6U primer
<400> 81 

ttccccgaaa atgttggaac tc 

<210> 82 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.6L primer
<400> 82 

tgggagagaa aaaattggta gatcg 

<210> 83 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.7U primer
<400> 83 

cgatggagac gaagactc 

<210> 84 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: RAP2.7L primer
<400> 84 

gtcggaaccg gagttacc 

<210> 85 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.8U primer
<400> 85 

tcactcaaag gccgagatc 

<210> 86 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.8L primer

<400> 86 

taacaacatc accggctcg 

<210> 87 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.9U primer
<400> 87

gtgaaggctt aggaggag

<210> 88 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.9L primer
<400> 88 

tgcctcatat gagtcagag 

<210> 89 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.10U primer
<400> 89 

tcccggagct tttagccg 
<210> 90
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.10L primer
<400> 90 

caacccgttc caacgatcc 19 

<210> 91 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.11U primer
<400> 91 

ttcttcacca gaagcagagc atg 23 

<210> 92 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.11L primer
<400> 92 

ctccattcat tgcatatagg gacg 24 

<210> 93 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.12U primer
<400> 93 

gctttggttc agaactcgaa catc 24 

<210> 94 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.12L primer
<400> 94

aggttgataa acgaacgatg cg 22

<210> 95 
<211> 21 
<212> DNA 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: Primer RISZU 1 
<400> 95 

ggaytgtggg aaacaagttt a 21 

<210> 96 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer RISZU 2 
<400> 96 

tgcaaagtra cacctctata ctt 23 

<210> 97 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer RISZU 3
<400> 97 

gcatgwgcag tgtcaaatcc a 21 

<210> 98 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer RISZU 4
<400> 98 

gaggaagttc vaagtataga 20 

<210> 99 
<211> 4 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: putative 
nuclear localization sequence 

<400> 99 
Lys Lys Ser Arg 
1 



<210> 100 
<211> 18 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: RAP2.3 primer



<400> 100

tcatcgccac gatcaacc 

<210> 101 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: RAP2.3 primer
<400> 101 

agcagtccaa tgcgacgg 18 

<210> 102 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: EREBP-like
subclass AP2 domain conserved YRG element WAAEIRD
box motif

<400> 102

Trp Ala Ala Glu Ile Arg Asp
1 5 



<210> 103 
<211> 4 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: MADS box domain 

<400> 103 
Met Ala Asp Ser 
1 



<210> 104 
<211> 4 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: AP2-like
subclass AP2 domain conserved YRG element
WEAR/WESH motif

<220> 

<221> MOD_RES 
<222> (3) 

<223> Xaa = Ala or Ser 
<220> 

<221> MOD_RES
<222> (4) 

<223> Xaa = Arg or His 
<400> 104
Trp Glu Xaa Xaa 
1 